Splunk : Indexing multi-event JSON data

Context

  • Splunk : Splunk Light 6.2.6
  • OS : Mac OS X 10.10.3 (Yosemite)

Purpose

Properly index a multi-event JSON file with Splunk.

Source JSON

My JSON was on a single line; it is pretty-printed here only to make it easier to read.

PROPS.CONF file

  • TIME_PREFIX : Regular expression that tells Splunk where to find the date of the event. In my case it was : “A double quote, followed by ‘created_at’, followed by a double quote, followed by a colon and finally followed by a space”.
  • TRUNCATE : Don’t truncate events after they reach a length limit. I needed this because I had a single JSON line.
  • LINE_BREAKER : When parsing the file, where should Splunk decide to create a new event ? In the above example, that would be between ‘}, {’. Notice that the capture group only contains ‘,’. When reading the documentation, you will find out that the text matched by the first capture group is discarded from the final events.
  • SHOULD_LINEMERGE : Don’t merge my events back together once you’re finished breaking them apart
  • SEDCMD-remove_header : SED expression that removes all the unwanted JSON before the ‘comments : []’ section
  • TIME_FORMAT : Tells Splunk which format to expect when parsing ‘created_at’
  • SEDCMD-add_closing_bracket : Since LINE_BREAKER above is removing our trailing ‘}’, we need to add it back to every event.
  • crcSalt : Instructs Splunk to use the filename as the salt when establishing the uniqueness of your file (this one belongs in inputs.conf). I have no log rotation in place, so it is safe to use the filename.
  • SEDCMD-correctly-close : Since we cleaned up the beginning of the file with SEDCMD-remove_header, we also have to clean up the end of the JSON. In plain English it says : “Replace ‘} ] }’ with ‘} }’”
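
To get a feel for how line breaking and timestamp extraction fit together, here is a small Python sketch that imitates them outside of Splunk. It is not Splunk code and it is not my actual configuration: the payload, the two regular expressions and the helper names (break_events, extract_time) are all made up for the illustration, and I assume the timestamp is a quoted string sitting right after the prefix. The splitting rule follows the documented behaviour of LINE_BREAKER: the first capture group is dropped, what comes before it ends one event and what comes after it starts the next.

```python
import json
import re
from datetime import datetime


def break_events(raw, line_breaker):
    """Split raw text on a LINE_BREAKER-style regex: the first capture
    group is dropped, the text before it closes the current event and
    the text after it opens the next one."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events


def extract_time(event, time_prefix, time_format):
    """Locate the quoted value that follows time_prefix and parse it
    with a strptime-style format (assumes the value is quoted)."""
    match = re.search(time_prefix + r'"([^"]+)"', event)
    return datetime.strptime(match.group(1), time_format)


# Hypothetical single-line payload, header and footer already removed.
raw = ('{"id": 1, "created_at": "2015-09-01T10:00:00"}, '
       '{"id": 2, "created_at": "2015-09-02T11:30:00"}')

# Hypothetical breaker: the braces sit outside the group, so they survive.
events = break_events(raw, r'\}(,\s*)\{')
for event in events:
    record = json.loads(event)
    when = extract_time(event, r'"created_at":\s', '%Y-%m-%dT%H:%M:%S')
    print(record["id"], when.isoformat())

# If the closing brace is captured inside the group it is dropped too,
# and has to be appended again, which is the job of a sed-style fix.
clipped = break_events(raw, r'(\},\s*)\{')
repaired = clipped[0] + '}'
```

The second call shows why the exact position of the parentheses matters: whatever the group swallows never reaches the event, so any brace caught in it has to be restored afterwards before the event is valid JSON again.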

Result in Splunk

Indexed JSON file on Splunk
