Overview for Datumize Data Dumper (DDD), used to capture network traffic.

Overview

One relevant source of Dark Data is found in network transactions. By carefully examining the network traffic and reconstructing the conversations, hidden metrics can be recovered non-intrusively.

Datumize Data Dumper (DDD) is a Datumize product aimed at capturing network packets very efficiently at a deep operating system level with minimal-to-no packet loss. It usually works in combination with Datumize Data Collector (DDC): DDD manages the segmentation, filtering and temporary persistence of network packets in PCAP files, while DDC picks up the segmented files for further processing. The diagram below represents DDD in action, receiving network packets from the operating system, capturing, filtering and structuring the output into binary files that will be later processed by DDC.

Technology

Datumize Data Dumper is a software component that uses libpcap and tcpdump to capture network packets while they are in operating system memory, applies filters to select just the traffic needed, and stores the packets in PCAP files while minimizing overall packet loss. Some important concepts to keep in mind:

  • libpcap: open source library used to intercept packets at operating system level. It works in user space and performs very well for a software solution. If you need to capture at extreme bandwidth, you either go for dedicated hardware (an appliance) or use different libraries working in kernel space; Datumize uses libpcap because it runs on multiple standard operating systems.
  • tcpdump: a very handy open source program to capture, filter and store packets (releases). Wrapped within DDD with extra goodies.
  • BPF filter: the network filtering syntax, extremely flexible and powerful. DDD uses the Berkeley Packet Filter (BPF) syntax.
  • Operating system: although libpcap is portable and there are Unix and Windows versions, Linux tends to be more robust and minimizes packet loss.
  • PCAP: this binary format supports the storage of network packets for further analysis. DDD supports this format for persisting the selected packets.
  • Storage: the output files are organized in a directory, and pcap output files have different partitioning options.
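
As a small illustration of the PCAP format mentioned above, the sketch below inspects the first four bytes of a file (its magic number) to tell a classic pcap file from a pcapng file. The function name pcap_kind is hypothetical and is not part of DDD; the magic values are the ones used by libpcap savefiles.

```shell
#!/bin/sh
# Hypothetical helper (not part of DDD): classify a capture file by its
# 4-byte magic number, read with od as plain hex digits.
pcap_kind() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case $magic in
        d4c3b2a1|a1b2c3d4) echo "pcap (microsecond timestamps)" ;;
        4d3cb2a1|a1b23c4d) echo "pcap (nanosecond timestamps)" ;;
        0a0d0d0a)          echo "pcapng" ;;
        *)                 echo "unknown" ;;
    esac
}

# Usage (assuming a capture file exists at this hypothetical path):
#   pcap_kind /opt/datumize/pcap/example.pcap
```

Both byte orders are listed for each magic value because the file is written in the byte order of the capturing host.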



Configuration

DDD is very sensitive to configuration. Make sure you know what you're doing.

The Datumize Data Dumper (DDD) configuration exposes the following properties.


GROUP | PROPERTY | ID | DESCRIPTION | TYPE | DEFAULT
--- | --- | --- | --- | --- | ---
Default | Buffer size | BUFFSIZE | Memory buffer size in KB. This buffer temporarily holds network packets before the filter is applied and the packet is eventually copied to the output. A big buffer uses more memory; a small buffer causes packet loss when it overflows. | Integer | 8192
Default | Device | DEV | Network device (as shown by ifconfig). | String | eth0
Default | Filter | FILTER | PCAP network filter as supported by tcpdump. The filter must be quoted, e.g. "tcp and host 192.168.204.24". | String |
Default | Snapshot length | SIZE | Packet snapshot length: the size of the window (in bytes) used for packet capture. This is a very sensitive property. | Integer | 262144
Advanced | Rotate seconds | ROTATE | Rotate output pcap files every given number of seconds. | Integer | 20
Advanced | Owner | USER | Owner of the output pcap folder and files (user:group notation). | String | datumize:datumize
Advanced | Staging directory | RAMFS | Absolute path of the ramfs (memory filesystem) folder. | Path | /mnt/ramfs1
Advanced | Output directory | OUTPUT | Absolute path of the pcap output folder. | Path | /opt/datumize/pcap
Advanced | Log file | LOGFILE | Relative or absolute path of the log file. | Path | /opt/datumize/log/tcpdump.out
Advanced | Pcap split | SPLITNUM | Use the pcap splitter, if present, to split each pcap file into the configured number of parts. | Integer | 10
Advanced | Sleep on move | SLEEP | Sleep a number of seconds after moving a recently closed pcap file to the output directory. | Integer | 5
Advanced | Extra parameters | EXTRA | Extra parameters, typically used to set specific user privileges. Usually used to add -Z root so that tcpdump keeps running as root when rotating files. | String |
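
To make the properties more concrete, the sketch below prints a tcpdump command line that such a configuration could translate into. The mapping is an assumption for illustration, not taken from the DDD wrapper itself: BUFFSIZE is mapped to -B, DEV to -i, SIZE to -s and ROTATE to -G, the capture is written into the staging directory, and the file name pattern capture-%Y%m%d-%H%M%S.pcap is hypothetical.

```shell
#!/bin/sh
# Property values from the configuration table (the FILTER value is the
# example given in its description).
BUFFSIZE=8192
DEV=eth0
FILTER="tcp and host 192.168.204.24"
SIZE=262144
ROTATE=20
RAMFS=/mnt/ramfs1
EXTRA="-Z root"

# Assumed mapping, for illustration only:
#   -B capture buffer size in KiB     -i network device
#   -s snapshot length in bytes       -G rotate the output file every N seconds
#   -w output file; with -G the name may contain strftime(3) conversions
build_tcpdump_cmd() {
    printf 'tcpdump -B %s -i %s -s %s -G %s %s -w %s/capture-%%Y%%m%%d-%%H%%M%%S.pcap %s\n' \
        "$BUFFSIZE" "$DEV" "$SIZE" "$ROTATE" "$EXTRA" "$RAMFS" "'$FILTER'"
}

# Print the command instead of running it, so it can be reviewed first.
build_tcpdump_cmd
```

Printing rather than executing keeps the sketch safe to run without root privileges; the resulting line can be pasted into a shell once the values look right.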


Advanced Configuration

DDD is automatically installed through Datumize Zentral (DZ) and that should be fine for most configurations.

Important considerations for snapshot length:

  • A big snapshot length decreases the performance of DDD and could cause heavy packet loss. The bigger the window, the more CPU cycles are needed for processing, filtering and copying.
  • A small snapshot length could yield truncated packets. If the snapshot is smaller than the actual packet size, you will get just the number of bytes defined in the snapshot.
  • A smaller snapshot length might be fine if you just want to analyze packet headers.
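
The truncation rule above can be sketched as simple arithmetic: a packet is stored whole when it fits in the snapshot length, and cut to the snapshot length otherwise. The function name captured_bytes and the sample values (96 bytes for a header-only capture, 1514 bytes for a full Ethernet frame) are illustrative assumptions.

```shell
#!/bin/sh
# Bytes of a packet that end up in the capture for a given snapshot length.
captured_bytes() {
    snaplen=$1
    packet_len=$2
    if [ "$packet_len" -gt "$snaplen" ]; then
        echo "$snaplen"      # packet is truncated to the snapshot length
    else
        echo "$packet_len"   # packet fits and is stored whole
    fi
}

captured_bytes 96 1514       # header-only capture of a full Ethernet frame
captured_bytes 262144 1514   # default snapshot length
```

With these inputs the first call prints 96 (the payload is lost) and the second prints 1514 (nothing is lost).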

Linux system limits can be tweaked for better TCP performance. Check reference1 and reference2.

DDD uses a memory-mounted filesystem (ramfs) as a staging area to improve packet capture performance.
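
A minimal sketch of how a move from the staging directory to the output directory could work. It assumes that files in the staging directory have names that sort chronologically and that the newest one is still being written. The function name move_closed_pcaps is hypothetical, and the actual DDD implementation is not described here.

```shell
#!/bin/sh
# Hypothetical helper: move every closed pcap file from the ramfs staging
# directory to the output directory. Assumes file names sort chronologically,
# so the last one in glob order is the file still being written.
move_closed_pcaps() {
    staging=$1
    output=$2
    prev=""
    for f in "$staging"/*.pcap; do
        [ -e "$f" ] || continue          # no match: the glob stays literal
        if [ -n "$prev" ]; then
            mv "$prev" "$output"/        # prev is older than f, so it is closed
        fi
        prev=$f
    done
}

# Example loop using the documented default paths and sleep time (not run here):
#   while :; do move_closed_pcaps /mnt/ramfs1 /opt/datumize/pcap; sleep 5; done
```

Leaving the newest file in place avoids handing a partially written capture to the next processing stage.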


Using tcpdump from command line

At some point you might need to use tcpdump from the command line to understand the traffic being captured, decide on the proper filter or fine-tune some DDD parameters.

Tasks and commands:

  • Capture all interfaces:
      tcpdump -i any
  • Show IP addresses instead of domain names:
      tcpdump -i <interface-name> -qn tcp -w pcap.pcap
  • Split a large pcap into smaller pcaps (e.g. 200 MB):
      tcpdump -r old_file -w new_files -C 200
  • Flush captured packets and prevent truncated pcap errors:
      tcpdump -U
  • Handy filters:
      sudo tcpdump -i eth0 -qn udp and dst host <ip_interface> -w pcap_new.pcap
      sudo tcpdump -q | sed '/ssh/d'; # hide ssh traffic
      tcpdump -i eth0 port not 22; # hiding with a tcpdump filter
      tcpdump -i eth0 -A | grep HTTP; # with grep
  • Filtering by protocol and port:
      tcpdump -i eth0 not udp and not arp and port not 22
      sudo tcpdump -i eth0 -qnn tcp and host 10.150.3.154 and not '(host 10.150.4.106 or host 10.150.4.107)'
  • Filtering by byte (example: hiding traffic with 0 data):
      tcpdump -i eth0 tcp and port 80 and '(((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
  • Complex filters (binary and hexadecimal): Check this.
  • Filter by odd and even IP:
      Even: tcpdump -i <interface> <options> 'ip[19] & 0x01 = 0 || ip6[39] & 0x01 = 0'
      Odd: tcpdump -i <interface> <options> 'ip[19] & 0x01 = 1 || ip6[39] & 0x01 = 1'
  • Get all the IPs of a pcap file (tcpdump prints the source in field 3 and the destination in field 5, each as address.port):
      tcpdump -r <pcap_file> -qnn tcp | awk '{print $3}' | sort | uniq > ips_hst
      tcpdump -r <pcap_file> -qnn tcp | awk '{print $5}' | sort | uniq > ips_dst
  • tcpdump cheat sheet: Check Packetlife
  • Wireshark cheat sheet: Check Packetlife
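
When listing the IPs of a pcap file, keep in mind that tcpdump prints TCP endpoints as address.port, with a trailing colon on the destination, so the resulting lists contain one entry per address and port pair. The sketch below reduces such a list to unique addresses. The function names strip_port and unique_hosts are hypothetical, and the pattern assumes numeric ports (tcpdump -nn) and TCP or UDP traffic, where a port is always present.

```shell
#!/bin/sh
# Remove the trailing ".port" (and an optional ":") from each endpoint.
strip_port() {
    sed 's/\.[0-9][0-9]*:\{0,1\}$//'
}

# Read endpoints on stdin, print each address once.
unique_hosts() {
    strip_port | sort -u
}

# Usage with a capture file (not run here):
#   tcpdump -r <pcap_file> -qnn tcp | awk '{print $5}' | unique_hosts > ips_dst
```

Endpoints without a port (for example ICMP traffic) would lose their last address octet with this pattern, which is why the sketch is limited to port-based protocols.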