network tools #1

Yet another place-holder for (less popular than the likes of Wireshark – no need to repeat those) tools I have found useful in my years of toying with networking and security:

* a set of tools, all packaged very nicely (merci, Laurent!), working on Linux, *BSD, Mac OS X (have I ever mentioned that this is my platform of choice? – having switched from Linux a few years ago) or even Windows: Netwib/ox/ag. It can be used either via a friendly GUI (NetwAG), or simply from the command line (NetwOX). The content of this toolbox is simply amazing!

* one of my all-time favorite sets of tools: ntop and nProbe. The first one is an amazing web-based network traffic analyzer, capable of working on captured traffic and/or in combination with NetFlow or sFlow. The second one is a software probe that captures traffic and exports it in NetFlow format, to be sent on to NetFlow analyzers (I use it extensively where Cisco NetFlow is not available).

* OpenNMS is a full-blown, open source Network Management solution. Its authors compare it with an enterprise-grade tool like HP OpenView. I personally consider it – alongside Nagios – a fantastic solution for centralized monitoring (a sort of informational portal)

* somewhat related to the above, in the category of portal-like monitoring tools with RRD-based graph trending capabilities, is Cacti – another favorite of mine.

* “sitting” in between Cacti and the previous two NMS tools is the ‘Just For Fun’ NMS, which is an SNMP + syslog capable NMS. I have not used it in a long time, but its updated info may convince me to give it another whirl one of these days.

google usage … nice

Very nice (and useful!) usage of Google search capabilities for real scientific info.

about fitness

John Wilkins wrote in his blog:

“Fitness is a property of a competing variant in a population. It means that X, whatever it might be biologically, is increasing in its frequency in a population faster than its competing variants. X can be a gene, or a trait, or even an entire organism’s form and functionality.

What fitness isn’t, is something absolute. There’s no universal measure of fitness that applies to all organisms. Every different kind of biological variant is only fit or not compared to the other variants in its population, whether that’s a local population, or a gene pool, or even a smaller group like a kin group (an extended family). Fitness is relative, literally and figuratively.”

… and, in one of his own blog-follow-up comments:

“… I think that fitness is a rate of frequency change, is that it provides a momentum view of fitness – it’s the rate right now at which the frequency of that X is changing, irrespective of what happened in the past or will happen in the future. This is how I get over the generation issue.”

cryptolinguistics – well said!

From Matt Blaze’s blog:

We often say that researchers break poor security systems and that feats of cryptanalysis involve cracking codes. As natural and dramatic as this shorthand may be, it propagates a subtle and insidious fallacy that confuses discovery with causation. Unsound security systems are “broken” from the start, whether we happen to know about it yet or not. But we talk (and write) as if the people who investigate and warn us of flaws are responsible for having put them there in the first place.

Words matter, and I think this sloppy language has had a small, but very real, corrosive effect on progress in the field. It implicitly taints even the most mainstream security research with a vaguely disreputable, suspect tinge. How to best disclose newly found vulnerabilities raises enough difficult questions by itself; let’s try to avoid phrasing that inadvertently blames the messenger before we even learn the message.

reading … (#3)

Ever wondered if John Nash’s story hinted also toward some “flavor” of synesthesia? Probably one of my all-time favorite experts in this area of study could clarify the issue …
UPDATE 2008! – Rama did it again!

beyond belief – 2006

Highly recommended Conference – a meeting of the brightest minds …

working with files

Need to list the 10 largest files, sorted by size

$ du -s * | sort -rn | head -10
OR
$ find . -maxdepth 1 -type f -print0 | xargs -0 du -s | sort -rn | head -10

Need to find files greater than 10MB

$ find . -maxdepth 1 -type f -size +10240k

Need to find the 10 most recently changed files (ls -c sorts by inode change time, which is not the date of creation)

$ ls -clt | grep ^- | head -10

Need to find the names of all the files in the current directory that have more than 20 characters in their file names

$ printf "%s\n" * | awk 'length($0) > 20'

… and if you want dot files as well:

$ printf "%s\n" * .* | awk 'length($0) > 20'
OR
$ ls | sed '/.\{21,\}/!d'
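
A side note on the size listing: du reports disk usage in blocks, not the byte size of each file. Here is a minimal sketch that ranks by byte size instead, assuming GNU find (its -printf action is not in POSIX). The name largest_files is made up for this example:

```shell
# List the N largest regular files directly inside DIR, by size in bytes.
# Assumes GNU find (for -printf); names containing newlines are not handled.
largest_files() {
    dir=${1:-.}
    n=${2:-10}
    find "$dir" -maxdepth 1 -type f -printf '%s\t%f\n' | sort -rn | head -n "$n"
}
```

Usage: largest_files . 10 prints one "size<TAB>name" line per file, biggest first.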

******
----- tar specific files (author unknown) -----

#!/bin/bash

# Get the directory name to be tarred up

echo "What is the directory you want to archive?"
read archivedir

# Get the name of the archive to create (with tar extension)

echo "What is the name of the archive you want to create?"
read archname

# Get the maximum filesize you desire for the archive in bytes
echo "What is the maximum size in bytes for a file?"
read maxsize

# move to the directory

cd "$archivedir" || exit 1

# this is where we create the list of files to be archived.

flist=`ls`

# now test each one to see if it is less than the max
for i in $flist; do
# test each file by awking out the size and comparing it
# to our maximum size. If it is larger we ignore it. If it is
# smaller we put it into a text file. I'm doing this because I
# intend to later make this able to use more than one directory
# interactively. Also if tar is mapped to tar -f and you can't
# use the -A or -r switches with -f

filesize=`ls -l $i | awk '{ print $5 }'`
if (( filesize < maxsize )); then
echo $i >> /tmp/tarlist.txt
fi
done

# Now that we have a list of files to tar up. Do it.

tlist=`cat /tmp/tarlist.txt`
echo $tlist
tar -cf "$archname" $tlist

# Clean up!

rm -f /tmp/tarlist.txt
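
The same idea can be sketched without the temporary list and without parsing ls, which also copes with spaces in file names. This assumes a tar that accepts --null and -T - (GNU tar and bsdtar do) and a find that supports -maxdepth. The name tar_small_files is a hypothetical helper:

```shell
# Archive every regular file in SRC_DIR smaller than MAX_BYTES into ARCHIVE.
# ARCHIVE should be an absolute path outside SRC_DIR, because the work
# happens after a cd into SRC_DIR (inside a subshell).
tar_small_files() {
    src_dir=$1
    archive=$2
    max_bytes=$3
    ( cd "$src_dir" &&
      find . -maxdepth 1 -type f -size "-${max_bytes}c" -print0 |
      tar --null -T - -cf "$archive" )
}
```

Usage: tar_small_files /some/dir /tmp/small.tar 100000 (both paths here are only examples).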

******

----- how to handle file names with spaces -----

NOT WORKING:

$ for arch in $(find /srv/www/ -iname '*' -type f); do md5sum $arch; done
(this works only as long as there is no space in any file name)

WORKING:

$ find /srv/www -type f -exec md5sum {} \;
OR
$ find /srv/www -type f -print0|xargs -0 md5sum
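
If more than one command has to run per file, a read loop over NUL-separated names is another option. This is a sketch assuming bash (read -d '' is not POSIX sh) and md5sum from GNU coreutils; checksum_tree is a made-up name:

```shell
# Checksum every regular file below DIR, one file at a time,
# without breaking on spaces in directory or file names.
checksum_tree() {
    find "$1" -type f -print0 |
    while IFS= read -r -d '' f; do
        md5sum "$f"
    done
}
```

Usage: checksum_tree /srv/www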

******

reading … (#2)

One of the books I am reading now is Richard Dawkins’s The Ancestor’s Tale. I like this passage:

“Biological evolution has no privileged line of descent and no designated end. Evolution has reached many millions of interim ends <…>, and there is no reason other than vanity – human vanity <…> – to designate any one as more privileged or climactic than any other.”

Perhaps it is not just a coincidence that, reading in parallel The Robot’s Rebellion – Finding Meaning in the Age of Darwin (Keith Stanovich), I stumbled across this:

“… the general public continues to believe in the discredited notion of evolutionary progress, this despite the fact that Stephen Jay Gould <…> has persistently tried to combat this error in his numerous and best-selling books. An important, but misguided, component of this view is the belief that humans are the inevitable pinnacle of evolution (“king of the hill … top of the heap” as the old song goes). Despite the efforts of Gould to correct this misconception, it persists. As Gould constantly reminds us, we are a contingent fact of history, and things could have ended up otherwise- that is, some other organism could have become the dominating influence on the planet”

reading … (#1)

A friend of mine has recently recommended an interesting subject of study to me, which will also constitute the starting point of my online notes. I could never go back and record my previous “n” years of reading in something equivalent to today’s blogs, since those notes are lost among the many hand-written pages filed away in my basement. Instead, I will try to commit every new item of interest to me (for details on what I really care about, please read my blogroll area) to this easy-to-use online bookmark/reference place … so here it goes …

autopoiesis – fascinating subject (thanks, Calin!). The term was coined by Chilean scientists Humberto Maturana and Francisco Varela:
“an autopoietic system is organized (defined as a unity) as a network of processes of production (transformation and destruction) of components that produces the components that:
1. through their interactions and transformations continuously regenerate and realize the network of processes (relations) that produced them;
and
2. constitute it (the machine) as a concrete unity in the space in which they [the components] exist by specifying the topological domain of its realization as such a network” [Varela, 1979]

Dr. Randall Whitaker reveals what I feel is one of the most important points about why autopoiesis is such an amazing breakthrough:

“What makes ‘autopoiesis’ distinctive as a definition for living systems?

If you go back and check most definitions (e.g., in a biology textbook), you are likely to find nothing more coherent than a list of features and functional attributes (e.g., ‘reproduction’, ‘metabolism’) which describe what living systems do, but not what they are. Because it is framed with respect to the constitution of a living system (as a specific class of systems), autopoiesis is a unique means for defining living systems in terms of their essential character (as opposed to their subsidiary features).”

******

Evan Thompson, in Life and mind: From autopoiesis to neurophenomenology. A tribute to Francisco Varela, has the following to say:

Living is cognition.
This proposition comes from Maturana and Varela’s theory of autopoiesis (Maturana and Varela, 1980). Some have taken the “is” in this proposition as the “is” of identity (living = cognition) (Stewart 1992, 1996), others as the “is” of predication or class inclusion (all life is cognitive) (Bourgine and Stewart, in press; Bitbol and Luisi, forthcoming). The origins of the proposition go back to Maturana’s 1970 paper, “Biology of Cognition” (Maturana 1970). There he used the concept of cognition widely to mean the operation of any living system in the domain of interactions specified by its circular and self-referential organization. Cognition is effective conduct in this domain of interactions, not the representation of an independent environment. In Maturana’s words: “Living systems are cognitive systems, and living as a process is a process of cognition. This statement is valid for all organisms, with and without a nervous system” (Maturana 1970 p. 13).
Francisco later came to prefer a different way of explicating the “living is cognition” proposition: “Living is sense-making.”
******
1. Life = autopoiesis. By this I mean the thesis that the three criteria of autopoiesis – (i) a boundary, containing (ii) a molecular reaction network, that (iii) produces and regenerates itself and the boundary – are necessary and sufficient for the organization of minimal life.
2. Autopoiesis entails emergence of a self. A physical autopoietic system, by virtue of its operational closure, gives rise to an individual or self in the form of a living body, an organism.
3. Emergence of a self entails emergence of a world. The emergence of a self is also by necessity the emergence of a correlative domain of interactions proper to that self, an Umwelt.
4. Emergence of self and world = sense-making. The organism’s world is the sense it makes of the environment. This world is a place of significance and valence, as a result of the global action of the organism.
5. Sense-making = cognition (perception/action). Sense-making is tantamount to cognition, in the minimal sense of viable sensorimotor conduct. Such conduct is oriented toward and subject to signification and valence. Signification and valence do not pre-exist “out there,” but are enacted or constituted by the living being. Living entails sense-making, which equals cognition.

******

useful analysis tools – usage reminder

How to obtain multiple files during a capture:

$ tethereal -i <interface> -a filesize:3000 -b 14 -s 96 -w <capture_file>
(3 MB files, with each packet truncated to a 96-byte snap length)

NOTE: older tcpdump versions also default to a short snap length (68 or 96 bytes); for a ring buffer, tcpdump offers -C <file_size> together with -W <file_count>.

******

If multiple files matching the shell pattern FOOBAR are to be merged:

$ mergecap -w bigfile.cap `ls FOOBAR`

******

Determine the type and length of capture:

$ file <capture_file>
capture_file: tcpdump capture file (big-endian) - version 2.4 (Ethernet, capture length 96)

******

Analysis of capture file by reading the file in ntop, for statistics

# ntop -f <dump_file> -m <subnet_considered_local>

# ntop -c -m 10.10.10.0/24 -n -q -O <ntop-suspicious> -r 30 -u root

where: subnet considered local was 10.10.10.0

then connect to http://localhost:3000

******

Analysis of capture file through snort (create config file, first!), with the following command:

$ sudo snort -r <capture_file> -c <config_file> -X -d -A full

******

Determine the number of connections in a capture file:

./tcptrace -t -n <capture_file>

******

Stats w/tethereal:

# tethereal -i eth2 -z "io,stat,60,tcp&&tcp.port==21&&tcp.flags==0x02,\
COUNT(tcp.flags)tcp.flags&&tcp.port==21&&tcp.flags==0x02,\
AVG(tcp.flags)tcp.flags&&tcp.port==21&&tcp.flags==0x02,\
MIN(tcp.flags)tcp.flags&&tcp.port==21&&tcp.flags==0x02,\
MAX(tcp.flags)tcp.flags&&tcp.port==21&&tcp.flags==0x02"

******

Time:

# tcpdump -r <file> ==> time LOCAL to where the trace is being read
# tcpdump -r <file> -tt ==> UNIX epoch time
# date -r <result_of_above>
# tcpdump -r <file> -tttt ==> date and time, still LOCAL to where the trace is being read (run it with TZ=UTC to get UTC time)

So – we could potentially identify the location of the systems!
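
For the -tt output, a small helper can print the epoch value in UTC no matter where the trace is being read. This is a sketch assuming either GNU date (-d @secs) or BSD/Mac OS X date (-r secs); epoch_to_utc is a made-up name:

```shell
# Print an epoch timestamp (as shown by tcpdump -tt) as a UTC date and time.
# Tries the GNU date syntax first, then falls back to the BSD/Mac OS X syntax.
epoch_to_utc() {
    secs=${1%%.*}    # drop the fractional part, e.g. 1073599200.123456
    date -u -d "@$secs" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
    date -u -r "$secs" '+%Y-%m-%d %H:%M:%S'
}
```

Usage: epoch_to_utc 1073599200.123456 prints 2004-01-08 22:00:00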

******

Example of usage – tcpflow accepts “expression” BPFs

# tcpflow -r <file_name> -c "(src port 21 and host 172.16.4.4)" or "(src port 20 and host 172.16.4.4)" > flowfile.txt

NOTE: -c above makes tcpflow print the traffic to the console (stdout, redirected here into one file) instead of storing it in files; if omitted, tcpflow creates two files per connection: one from source, one from destination

******

Determine OS:

# p0f -s <capture_file> -x "expression" (usually "host <IP_address>")

NOTE: -x dumps the whole packet content

******
Consolidate src-dst – see Honeynet challenge 23 – very, very useful!

$ tethereal -nr <capture_file> | ./sumsrcdst > file_with_conversations

******

TTL by IP conversation

$ tcpdump -vvvr <capture_file> |awk '{print $2, $5, $6, $15, $17}' |sed 's/,//;s/://;s/\./ /4' |sed 's/\./ /7' |grep IP |awk '{print $3, $4, $6}' |sort |uniq > ttl-by-ip-conv.txt

******

ngrep (-q) -> string searches inside network captures:

$ ngrep -I <capture_file> -q -x 'passwd' 'tcp port 21' --> reveals the attempts for passwd file retrieval or processing via ftp

also: $ ngrep -q -I <capture_file> passwd port 21

$ ngrep -I 2003.12.15.cap -q -x 'shadow' 'tcp port 21' --> same with shadow file access attempts

******

Time-related splitting of files:

# tethereal -r <capture_file> -w <new_file> -R '(frame.time >= "Jan 8, 2004 22:00:00.00") && (frame.time <= "Jan 8, 2004 23:00:00.00")'

******

Validate distance between networks as previously obtained w/ntop:

$ sudo p0f -l -s <capture_file> |sed 's/>//g' |awk -F "-" '{print $1,$3}' |grep distance |sed 's/:/ /g' |awk '{print $1"<-->"$3"=="$6}' |sed 's/,//' |sort |uniq > distances.txt

******

TCP conversations, sorted and counted:

$ tcptrace -n -t <capture_file> |sed s/":"/" "/g |awk '{print $2 $4 $5}' |sort |uniq -c |sort -
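
When tcptrace is not at hand, the counting step can be tried on any list of source/destination pairs. This is a sketch, assuming one "src dst" pair per line on standard input. The name count_conversations is made up, and the addresses in the usage lines are invented sample data:

```shell
# Count how often each "src dst" pair occurs and list the busiest first.
count_conversations() {
    awk 'NF >= 2 { seen[$1 " " $2]++ }
         END { for (pair in seen) print seen[pair], pair }' |
    sort -rn
}

# Usage with invented sample data:
printf '%s\n' '10.10.10.195 10.10.10.1' '10.10.10.195 10.10.10.1' '10.10.10.3 10.10.10.1' |
count_conversations
```

Each output line carries the count first, followed by the pair.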

******

Finding all hosts having been contacted by 10.10.10.195 on the SSH port:

$ tcpdump -r <capture_file> -X -s 1514 'host 10.10.10.195 and tcp port 22' |grep ssh |awk '{print $3;}' |awk -F. '{print $1"."$2"."$3"."$4;}' |sort |uniq

******

MAC address connections:

$ tcpdump -neqr <capture_file> |awk '{print $2" "$3" "$4;}' |sed 's/,//g'

******

MAC and IP connections in one line:

$ tcpdump -neqr <capture_file> |awk '{if ($5=="IPv4,") print $2" "$3" "$4" "$9" "$10" "$11;}'

******

Script to determine the MAC addresses associated with IPs:

$ tcpdump -nner <capture_file> | awk '{print $2 " " $11}' | awk -F. '{if($1!~/x+|w+|r+/) print $1 "." $2 "." $3 "." $4}' |sort -u > mac-and-ip-address-pairs.txt

******

Creating individual capture files, based on some read-filters, prior to creation of output files:

$ tethereal -r <capture-file> -V -R <read-filter> -w <output-file>
$ tethereal -r <capture_file> -V -R 'tcp.port==20 or tcp.port==21' -w <ftp-sessions>

******

Another way to reveal conversations:

$ ipsumdump -psSdD -r <capture_file> |sort -k2n |uniq

******

MAC-to-vendor from oui.txt

$ awk '$1 ~ /^[0-9a-f][0-9a-f]\-[0-9a-f][0-9a-f]\-[0-9a-f][0-9a-f]/ {print $3}' oui.txt
$ awk '$1 ~ /^[0-9a-f][0-9a-f]\-[0-9a-f][0-9a-f]\-[0-9a-f][0-9a-f]/ {print $1, $3}' oui.txt |sed 's/-/:/g' > mac-to-vendor.txt

******

Replace first 3 bytes of MAC address from source-mac.txt with vendor name from above:

$ awk 'FNR==NR{a[$1]=$2;next} {b=$0;k=substr($1,1,8);if(k in a)b=a[k]substr($0,9);print b}' mac-to-vendor.txt source-mac.txt > mac-to-vendor-to-ip.txt
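
To see what the two-file awk idiom does, here is the same logic wrapped in a function with longer variable names. It is only a sketch: it assumes the vendor file has "aa:bb:cc Vendor" on each line and that each line of the MAC list starts with a colon-separated MAC address. The name mac_to_vendor and the sample values in the usage note are invented:

```shell
# Replace the first three bytes of each MAC in MACS_FILE with the vendor
# name looked up in VENDORS_FILE; lines with an unknown prefix pass through.
mac_to_vendor() {
    vendors_file=$1
    macs_file=$2
    awk 'FNR == NR { vendor[$1] = $2; next }      # first file: build the lookup table
         { line = $0; prefix = substr($1, 1, 8)   # "aa:bb:cc" is 8 characters
           if (prefix in vendor) line = vendor[prefix] substr($0, 9)
           print line }' "$vendors_file" "$macs_file"
}
```

With a vendor line "00:00:0c Cisco", an input line "00:00:0c:12:34:56 10.10.10.1" comes out as "Cisco:12:34:56 10.10.10.1".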

******

Analysis of flows

$ tcpick -r <capture_file> -n -C -yP "host 10.10.10.195" |egrep -v 'FIN|SYN|TIME|CLOSED'

******

========= congraph shell script ==========

# tethereal creates a list of packets
# cut pulls off the two addresses
# sed removes the arrow to protect it from later munges
# sort puts duplicates next to each other
# uniq removes adjacent duplicates

tethereal -r $ipf -N mnt | awk '$4=="->"{print $3,"###",$5;}' | sort |uniq > raw

# Create the connections list:
# sed munges the names of the nodes
# sed prefixes node names that start with a digit

sed 's/[-\.:()]/_/g' < raw | sed 's/\(^[0-9_][0-9_]*\)/IP\1/g;s/ \([0-9_][0-9_]*\)/IP\1/g' > cons

# Create the nodes list:
# sed puts all node names on separate lines
# sort | uniq removes duplicates
# sed duplicates the names on the same line, with one inside a label attribute
# sed munges the names in an identical manner to the connection list munge above, but only in the first name
# including prefixing names that start with a digit

sed 's/ ### /\n/' < raw | sort | uniq | sed 's/\(.\+\)/\1 [label=\"\1\"]/' | sed ':loop;s/[-\.:()]\(.* \[lab\)/_\1/;t loop;s/\(^[0-9]\)/IP\1/g' > labels

******

Format tcpdump output, via tcptrace and using xplot:

$ sudo tcpdump -s 100 -w <output_file.cap> host <hostname_or_IP>
$ tcptrace -Sl <output_file.cap>
$ xplot a2b_tsg.xpl

******
Other tools: netwox/netwag; pcapmerge; argus/ra/racount/rasort; sguil (w/snort); ACID or BASE (w/snort) …