Intro to systemtap

Posted by Ian S. Nelson Wed, 30 Aug 2006 23:13:00 GMT

What is systemtap?

Systemtap is a tool built upon the kprobes framework for probing, debugging, and analyzing the Linux kernel at runtime. IBM contributed kprobes to Linux, and it was included during the 2.5 development series in 2002. It's a supported and standard portion of the Linux kernel. Systemtap builds upon kprobes and creates a fairly easy-to-use tool that lets you probe just about anything and everything in the Linux kernel with fairly simple commands; you can almost effortlessly modify any variable you wish on a running kernel, all with minimal performance impact. Essentially it's the Linux version of dtrace. The differences are that systemtap is probably more flexible in kernel space, while dtrace currently does a lot more for probing userspace applications.

So how do I get this working?

It's fairly easy and straightforward to install. There are RPMs for Fedora, Red Hat Enterprise, and SUSE-based Linux distributions. There is a slight trick, at least with Fedora Core 5: you also need the kernel debug RPM installed, which is basically just a set of object code files with debug symbols. I couldn't find it in the latest Fedora Core 5, and on closer inspection the kernel RPM didn't have the debug package enabled by default. I placed the following in my kernel-2.6.spec from Fedora Core 5 and was able to build the kernel debug symbol package.

%define _enable_debug_packages 1

With that added to your kernel spec file, you can rebuild the kernel with rpmbuild and it will emit the debug package.

What can you do now?

Now you can start probing stuff. The systemtap site has some fairly simple demos; one prints the top 20 most common syscalls every 5 seconds.

#!/usr/bin/env stap
#
# This script continuously lists the top 20 systemcalls on the system
#

global syscalls

function print_top () {
        cnt=0
        log ("SYSCALL\t\t\t\tCOUNT")
        foreach ([name] in syscalls-) {
                printf("%-20s\t\t%5d\n",name, syscalls[name])
                if (cnt++ == 20)
                        break
        }
        printf("--------------------------------------\n")
        delete syscalls
}

probe kernel.function("sys_*") {
        syscalls[probefunc()]++
}

# print top syscalls every 5 seconds
probe timer.ms(5000) {
        print_top ()
}

This little probe counts each system call as it happens, dumps the list every 5 seconds, and then starts counting again. Here is some sample output:

SYSCALL                         COUNT
sys_gettimeofday                  771
sys_futex                         637
sys_clock_gettime                 214
sys_rt_sigprocmask                141
sys_select                        136
sys_poll                          106
sys_newstat                        84
sys_ioctl                          62
sys_read                           48
sys_write                          47
sys_fcntl                          47
sys_rt_sigaction                   38
sys_time                           30
sys_getppid                        26
sys_setitimer                      16
sys_recvfrom                       13
sys_rt_sigreturn                   13
sys_nanosleep                      12
sys_close                          11
sys_open                           10
sys_newfstat                        6

Try altering the load on your system and seeing how that affects things. That's kind of interesting but not terribly useful. Try this version:

#! /usr/bin/env stap
#
# This script continuously lists the top 20 systemcalls on the system
#

global syscalls

function print_top () {
        cnt=0
        log ("SYSCALL\t\t\t\tCOUNT")
        foreach ([name] in syscalls-) {
                printf("%-20s\t\t%5d\n",name, syscalls[name])
                if (cnt++ == 20)
                        break
        }
        printf("--------------------------------------\n")
        delete syscalls
}

probe kernel.function("sys_*") {
        # only count calls made by the process passed in with -x
        if (target() == pid())
                syscalls[probefunc()]++
}

# print top syscalls every 5 seconds
probe timer.ms(5000) {
        print_top ()
}

I added a line to the probe that checks whether the caller matches target(), a value that systemtap lets you pass in. Now you can add a -x argument with the pid of a process, and the script will perform the same probes only against that particular process. I have a postgresql database running, and ps tells me that pid 16679 is the writer process.

...
postgres 16679  0.0  4.4 199916 91000 ?        S    Jul21   0:00 postgres: writer process
...

Let's look at it.

SYSCALL                         COUNT
sys_getppid                        25
sys_time                           25
sys_select                         25

This repeats over and over. Since the database isn't doing anything, that makes sense: it's just waiting for something to tell it what to do. You can attach strace to that process (strace -p 16679) and see that it is correct. Let's disturb the database, make it do some I/O, and see how it behaves. I'll insert a record into a table.

SYSCALL                         COUNT
sys_getppid                        25
sys_time                           25
sys_select                         25
sys_lseek                           4
sys_write                           4

That makes some sense: one record inserted resulted in 4 seeks and 4 writes. Maybe that's a write to an index, a couple of writes to b+tree nodes, and then the actual record being written, or maybe it is writing to a journal file. I don't know, but it sounds kind of reasonable, and I haven't looked at the postgresql code yet. In fact, we haven't looked at any source code yet or really compiled anything. This is kind of a cheap example, since strace and the ptrace API can provide all of this information (although they have a considerable impact on system performance), so here is something a bit harder to do.

#! /usr/bin/env stap
global comps
global compsizes
probe kernel.function("memcmp")
{
        comps++
        compsizes = compsizes + $count
}

probe timer.ms(5000)
{
        printf("memcmp call count      bytes compared\n")
        printf("%d                    %d\n", comps, compsizes)
        comps = 0
        compsizes = 0
}

This probe monitors the kernel's memcmp function which compares two buffers of memory. $count is an argument to that function which contains the number of bytes to compare.

memcmp call count      bytes compared
17151                    114214
memcmp call count      bytes compared
5455                    34990
memcmp call count      bytes compared
17165                    114269
memcmp call count      bytes compared
5398                    34817

As you can see, memcmp is called a lot, and on average it compares about 6 to 7 bytes of data per call. I wonder why it sort of oscillates? I'll have to look into that.
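The per-call average can be checked against the four samples above. This is only a quick sketch in Python (not part of the systemtap script); the numbers are copied from the sample output and the helper name is made up for illustration.

```python
# (call count, bytes compared) pairs copied from the sample output above
samples = [(17151, 114214), (5455, 34990), (17165, 114269), (5398, 34817)]


def average_bytes_per_call(calls, total_bytes):
    """Hypothetical helper: mean number of bytes compared per memcmp call."""
    return total_bytes / calls


averages = [average_bytes_per_call(calls, total) for calls, total in samples]

for (calls, total), avg in zip(samples, averages):
    print(f"{calls:>6} calls  {total:>7} bytes  {avg:.2f} bytes/call")
```

Every interval lands between 6 and 7 bytes per call, even though the call count itself swings by about a factor of three.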

Here is another simple one, __kmalloc is used to allocate memory in the kernel.

#! /usr/bin/env stap
global allocs
global allocsizes

probe kernel.function("__kmalloc")
{
        allocs++
        allocsizes = allocsizes + $size
}

probe timer.ms(5000)
{
        printf("malloc count      bytes malloced   \n")
        printf("%d                     %d              \n", allocs, allocsizes)
        allocs = 0
        allocsizes = 0
}

which produces something looking like this:

malloc count      bytes malloced
16                     30128
malloc count      bytes malloced
36                     69312
malloc count      bytes malloced
76                     64469
malloc count      bytes malloced
28                     56651

So every 5 seconds this particular system is doing about 40 mallocs, and it's averaging about 1.4K per malloc if my napkin math is correct.
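The napkin math can be redone from the four samples above. A minimal sketch in Python, with the numbers copied from the output; the variable names are just for illustration.

```python
# (allocation count, bytes allocated) per 5-second interval, from the output above
samples = [(16, 30128), (36, 69312), (76, 64469), (28, 56651)]

total_calls = sum(calls for calls, _ in samples)
total_bytes = sum(size for _, size in samples)

calls_per_interval = total_calls / len(samples)
bytes_per_interval = total_bytes / len(samples)
bytes_per_call = total_bytes / total_calls

print(f"calls per interval: {calls_per_interval:.0f}")
print(f"bytes per interval: {bytes_per_interval:.0f}")
print(f"bytes per call:     {bytes_per_call:.0f}")
```

With these samples that is roughly 39 calls and 55K allocated per interval, or about 1.4K per call.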

That's just a taste; next week I'll do a more interesting example.


Déjà vu with big time port scanning

Posted by Tate Hansen Wed, 23 Aug 2006 05:13:00 GMT

We have a potential pen test project coming up with crazy numbers: a range of 260,000+ publicly addressable IPs (spread over several non-contiguous blocks). Our previous experience with aggressively scanning large blocks over the internet has made us, well, super sensitive. It’s freakin’ hard.

260,000 x 65535 TCP ports x 2 (number of port query attempts) = 34,078,200,000 TCP SYN packets (nmap, default).

That is ~4x larger than our last "large" effort (~70,000 IPs).

No info yet on the bandwidth available per block, latency conditions, or other factors, but let’s run some numbers to see how this would work.

From memory, the Sun v40zs were averaging ~3500 pps (packets per second) outbound per server. I didn’t use packets per second to estimate the previous effort; instead I created a quick Excel chart similar to the one below to get a ballpark number (updated to use 260,000 as the total number of IP addresses to scan):

Quantity                                                  Value             Notes
approx. number of IP addresses to scan                    260,000
number of TCP ports to query per IP address               65,535
total # of TCP ports to query                             17,039,100,000

time-out setting per port query                           1.25 seconds      allow 1.25 seconds for query response
number of query attempts per port                         2                 which implies we need 34,078,200,000 SYN requests
total seconds per port query                              2.50 seconds

number of parallel port queries                           2                 we set it to 100, but empirical evidence shows this to be ~2
total # of hours to complete single IP port scan          22.76

total number of seconds to perform scan (if sequential)   21,298,875,000
total number of hours to perform scan (if sequential)     5,916,354

number of scans in parallel                               1,600             2 servers, each running 8 unique nmap processes with min_hostgroup set to 100
number of hours to complete scan                          3,698
number of days to complete scan                           154
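The chart above can be reproduced in a few lines. This is only a sketch of how the spreadsheet appears to work: it assumes every port query waits out the full time-out on every attempt, and the variable names are mine, not from the spreadsheet.

```python
# Inputs taken from the chart above
ips = 260_000
ports_per_ip = 65_535
timeout_s = 1.25               # time-out setting per port query
attempts = 2                   # number of query attempts per port
parallel_port_queries = 2      # observed, not the configured 100
parallel_hosts = 1_600         # 2 servers x 8 nmap processes x min_hostgroup 100

total_ports = ips * ports_per_ip
total_syns = total_ports * attempts

# Assumption: each port costs the full time-out for every attempt
seconds_per_port = timeout_s * attempts
seconds_per_host = ports_per_ip * seconds_per_port / parallel_port_queries
hours_per_host = seconds_per_host / 3600

sequential_seconds = ips * seconds_per_host
sequential_hours = sequential_seconds / 3600

scan_hours = sequential_hours / parallel_hosts
scan_days = scan_hours / 24

print(f"total ports:      {total_ports:,}")
print(f"total SYNs:       {total_syns:,}")
print(f"hours per host:   {hours_per_host:.2f}")
print(f"sequential hours: {sequential_hours:,.0f}")
print(f"scan hours:       {scan_hours:,.0f}")
print(f"scan days:        {scan_days:.0f}")
```

Changing parallel_port_queries is the quickest way to see how sensitive the 154-day figure is to that one observed value.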
 

Now I realize pps is an important factor. Remember, these are still pretty fast servers: quad dual-core Opterons with 16 GB RAM. A sanity check against our previous results of ~3500 pps per server goes as follows:

Quantity                                            Value
total # of SYN requests                             34,078,200,000
average outbound packets per second / per server    3,500
number of dedicated servers                         2
total number of seconds to complete scan            4,868,314.29
number of hours to complete scan                    1,352.31
number of days to complete scan                     56
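The packets-per-second sanity check is a single division. A small sketch, assuming the only limit is the sustained outbound packet rate; the function name is hypothetical.

```python
def scan_seconds(total_packets, pps_per_server, servers):
    """Hypothetical helper: seconds needed to send total_packets at a sustained rate."""
    return total_packets / (pps_per_server * servers)


total_syns = 260_000 * 65_535 * 2    # IPs x ports x query attempts
seconds = scan_seconds(total_syns, pps_per_server=3_500, servers=2)
hours = seconds / 3600
days = hours / 24

print(f"{seconds:,.2f} seconds, {hours:,.2f} hours, {days:.0f} days")
```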

So the numbers don't match. The first chart says it'll take ~154 days; the second says ~56 days. In the first chart I set "number of scans in parallel" to 1600. I got that from 8 unique nmap processes, each running with a min_hostgroup of 100 (800 hosts being scanned simultaneously per server; we have two dedicated scanners, so the total is 1600 hosts). The number that is probably way off is the "number of parallel port queries" in the first chart. Although we had set the value to 100 (meaning scan 100 ports simultaneously on each host), it often seemed like we were only getting 2 or so in parallel (observed from watching tcpdump output). That was probably after running into the memory issues I reported with nmap (which Fyodor subsequently reported as fixed).

I just ran a few quick checks on a Sun Fire x2100 we have (dual core opteron 175 / 4 GB of RAM) using a newer nmap version (4.11). Firing off a bunch of nmap scans on one server with parameters similar to below resulted in a sustained 9000+ pps (~4.5 Mbits/s).

/usr/local/bin/nmap -vv -sS -P0 -p 1-65535 -n --min_hostgroup 100 --max_rtt_timeout 1250 --min_parallelism 100 <a_/24_block>

I didn't let the test run for very long to see if issues arise like before. If we could sustain 9000 pps per server and be allowed to push ~9Mbits/s, then the overall time is greatly reduced. It drops to ~22 days. I really doubt we'll be able to or allowed to push 9Mbits/s, but at least we now have ballpark figures to play with.
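For playing with those ballpark figures, here is a rough sketch that turns a packet rate into bandwidth and scan time. The 60-byte frame size is my assumption for a bare SYN on Ethernet, not a measured value, and the names are made up for illustration.

```python
ASSUMED_FRAME_BYTES = 60   # assumption: minimum-size Ethernet frame carrying a SYN


def mbits_per_second(pps, frame_bytes=ASSUMED_FRAME_BYTES):
    """Hypothetical helper: outbound bandwidth for a given packet rate."""
    return pps * frame_bytes * 8 / 1_000_000


total_syns = 260_000 * 65_535 * 2    # IPs x ports x query attempts
pps_per_server = 9_000
servers = 2

one_server_mbits = mbits_per_second(pps_per_server)
all_servers_mbits = mbits_per_second(pps_per_server * servers)
days = total_syns / (pps_per_server * servers) / 86_400

print(f"one server:  {one_server_mbits:.2f} Mbit/s")
print(f"all servers: {all_servers_mbits:.2f} Mbit/s")
print(f"scan time:   {days:.1f} days")
```

Under that frame-size assumption the result is close to the observed ~4.5 Mbits/s per server and the ~22 day estimate.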

Who else has had this kind of crazy port scanning fun?

Related entry: maximizing nmap scans for accuracy

 


The VA and Bureaucracy Part 3

Posted by Cory Stoker Tue, 22 Aug 2006 23:42:00 GMT

In part 2 of my VA auditing experience I told you all about our "training" for the VA assessment. I am going to finish this out with my thoughts on the first site experience. If you missed it here is part 1. With all the things that had gone on with this project I was very interested in how the actual audit was going to go for each site. Before I could think long on it I was off to the wonderful state of Maine in February.

Now I live in Colorado, and most people's preconception of Colorado in the winter is exactly what Maine was: cold, snowy, and dark. For those of you that don't know, Denver, Colorado has a very mild winter, and snow barely stays on the ground for a week. The mountains are a different story, but Denver is on the plains, not in the mountains.

So back in Virginia we were told that we needed to carpool with the other auditors and that each auditor was responsible for ensuring the whole team got to the site. This was interesting, to say the least, as the audit teams were thrown together maybe 2 days before we actually flew out. Each trip I went on had a team with different people. This was great for meeting new people but horrible for carpooling, as the one person who had the car was expected to ferry us around! The first issue that greeted me was that I got to Portland, Maine at about 11:00 PM EST and had to get to Augusta, which is about 1 1/2 hours away. I could not get hold of the guy with the car; the call went to voicemail, surprisingly enough. Suffice it to say I had to take a taxi to Augusta, which cost about 170 dollars, footed by the taxpayers of course. For people that don't know Maine, Portland is in the south and Augusta, the capital, is in the lower center of the state, so a taxi ride was costly.

The second issue was that none of the audit staff could get hold of each other. In fact, I didn't even get to the facility till later on Monday because we were all staying at different hotels. Hotels, flights, and rental cars were chosen by the coordinators, not the auditors, so this was not negotiable. Anyhow, we were scheduled to be at the facility for 4 days and leave on the 5th day, so I was already thinking of how much fun I was going to have.

Onto Monday we go! After I get to the facility with my chauffeur, I finally find out how many computers we are testing. Let's see, the audit team had 3 "windows testers" including me, so that means we can get pretty good coverage in 4 days, right? Well, we had to test a grand total of 26 computers and all the mobile nursing stations, for a grand total of 30. Remember the checklist, the one that takes about 20 minutes per computer max? 30 / 3 = 10 computers per tester over 4 days. At 20 minutes each that is a bit over 3 hours, so doing some more math we can estimate about a 4 hour work day including lunch. Now this facility was pretty big. So big that I would have easily gotten lost without my VA companion. Off I went to verify the VA is secure with my clipboard! Suffice it to say that my VA companion was pleased to only waste 4 hours running MBSA and Dumpsec.

At this point I am sure a few of you are thinking that it was easy for me to test this minuscule number of computers and then just chill till it was time to leave, but it wasn't. We were not allowed to have cell phones on in the building because of possible interference with medical equipment, we were not allowed to go onto the VA network with our laptops (which makes sense), and we were in the middle of nowhere. Luckily we got to go home on the 3rd day, meaning that we had only spent 4 days total in snowy Maine.

A few thoughts on my whole VA auditing experience. First, I did actually like meeting the other auditors and the technical VA personnel. They were great and made the whole project actually move forward. I also got to go to places I would never have gone to if not on business. But what a waste of money the whole endeavor was. As Bruce Schneier always likes to say, this definitely had the perception of being a proactive security measure, but that is all it was, a perception. I think there were some serious loopholes somewhere that allow this sort of thing to go on. Like I said earlier, if this kind of project happened elsewhere everyone would be fired, unless of course they were interested in the perception. We ended up doing 10 facilities before we just could not take it anymore. We were not alone in that feeling, as I think every team I was on had new people who had replaced someone that went to the "training".


Books on Reversing

Posted by Cory Stoker Tue, 22 Aug 2006 04:37:00 GMT

I hope all of you moved over to our new blog server without issue. The other blog software was causing us some issues so we decided to move to another setup. This one runs on Typo so it is more suited for us. Both Tate and I know Ruby somewhat so we should be able to keep this up and running.

I have been on a study path to try and reinforce my class I took at Blackhat on Reverse Engineering. Here is the list of books I have been using:

Book Title                                      Author
"Reversing: Secrets of Reverse Engineering"     Eldad Eilam
"Exploiting Software"                           Greg Hoglund and Gary McGraw
"Hacker Disassembling Uncovered"                Kris Kaspersky
"Microsoft Windows Internals 4th Edition"       Mark Russinovich and David Solomon
"The Art of Assembly Language"                  Randall Hyde
"Write Great Code Volume 1"                     Randall Hyde
"Write Great Code Volume 2"                     Randall Hyde

If you are interested in disassembly or reversing then I highly recommend these books. The main book I am using is "Reversing: Secrets of Reverse Engineering", and I am following up with the other books as needed. The one book that might be disheartening is "The Art of Assembly Language". This book first teaches you a special language called High Level Assembly (HLA) and then slowly drops you down to low-level assembly for the x86, thereby making you learn two languages. This is why it is so big..... I believe the reason is that it is hard to actually do something in assembly without knowing most of assembly, so the author uses HLA to bridge the gap. I thought it worked out fine, but I wish I had known that I had to learn HLA first and then assembly. By the time I realized this I was too far in to stop.


Does security need to be designed in from the start?

Posted by Ian S. Nelson Mon, 21 Aug 2006 23:51:00 GMT

Does security need to be baked in to a product from the very start? Or can you add it after the fact? Does the development model affect this? I think this is a really interesting question. My instinctual answer is that to build a properly secured application or system of applications you have to plan for it from the start. That's the parochial answer, and I'm not sure that it is 100% correct. Over the last few years various iterative development models have become more popular, and while I cannot point to any examples of great success coming from them, I also can't point out any failures that could have been avoided with any other development model. The trend is to rapidly develop with little or no up-front design, adapt to changing requirements dynamically, and rapidly fix problems as they arise. Do these development models lead to less secure products by their very nature?

The logical follow-up is how do you iterate security in to a product after the fact, if that's a valid way to do it? Any thoughts or experiences?


Sorry for duplicates / Built new server stack

Posted by Tate Hansen Mon, 21 Aug 2006 03:13:00 GMT

Sorry for the duplicate entries in your RSS reader. We migrated Sunday to a new server and built a new stack: Apache -> mod_proxy_balancer, mod_rewrite -> Mongrel, Ruby, Rails, Typo -> PostgreSQL. We used mod_rewrite to hopefully keep the RSS links and others working (old permalinks end up dropping you on the latest entry). We haven't added all the comments back yet.

If you want to update your settings, our new RSS link is:

http://blog.clearnetsec.com/xml/rss20/feed.xml

 


Cory's Blackhat Training Day 1

Posted by Cory Stoker Wed, 09 Aug 2006 01:58:00 GMT


At Black Hat I took the "Reverse Engineering on Windows: Application in Malicious Code Analysis" course. The class was about reverse engineering malicious executable programs on the Windows platform, just like the antivirus guys do for big companies like Symantec and iDefense. This was a very fun class for me as my background is not in reverse engineering or its associated technologies like:


I have taken many courses before on security and network related subjects, but never from Black Hat. This was even my first Black Hat Briefings, so I was very excited to see how it would turn out. Many times before I was disappointed with the classes I took, as they tended to have a very interesting syllabus but then ended up shallow on the technical depth that I thought I would get. That had me skeptical of the Black Hat course as well, but man was I wrong! I think it took about 7 minutes into the class before we were launching IDA Pro and digging into the configuration of the tools we would use for the next 2 days. We didn't even introduce ourselves, leaving us to guess who was even teaching us.

My teachers, Pedram Amini and Ero Carrera, were some very bright and intelligent guys that had been in the malware/virus reverse engineering circles for years. They were so adept at reversing things that they have written many tools to help with reversing such as:

As you can see they were both Python fans. Too bad Ruby rocks Python so much that Python needs to go bash the Perl guys to feel better.

So by the first half day I was already disassembling the Mydoom.A virus looking for what it was made of, what it did, how it worked etc... Now this was not a "live" analysis but rather it was loading the binary executable up in a disassembler/debugger called IDA Pro and dissecting the binary code looking for things of interest. Basically we learned two main methods of doing analysis, "top down" and "bottom-up".

Top down was where you start at the program's main function and start labeling all the functions that are called to see what they do. For example you eyeball a function and decide it is gathering the time from the system so you label the function "Gets the time" and so on. Once you do this for all defined functions you can then concentrate on the ones that perform actions of interest like opening sockets or creating processes.

The bottom up approach was where we look for interesting code snippets, items in the Import Address Table (IAT) and strings table. For example if you find a call to "htons" and above it you see the number 80 (in hex of course) being placed into a register, you can deduce that it is making a call out to port 80 on the network.
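As a small illustration of that htons example, here is a sketch in Python (not from the class materials) of what port 80 looks like in hex and in network byte order, which is what htons produces.

```python
import struct

port = 80

# The constant you would spot in the disassembly: 80 decimal is 0x50
print(hex(port))                       # 0x50

# htons converts to network (big-endian) byte order; "!H" packs the same way
network_order = struct.pack("!H", port)
print(network_order.hex())             # 0050

# On a little-endian x86 host the same 16-bit value is stored byte-swapped
little_endian = struct.pack("<H", port)
print(little_endian.hex())             # 5000
```

So a 0x50 loaded into a register just before a call to htons is a good hint that the code is about to talk to port 80.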

Yes, I know this sounds hard, and it was..... But anybody could learn this skill with practice. I will try to write up some good snippets from my class if anybody is interested. Here are some interesting sites to peruse if you are interested in reversing:


security tools to checkout (new to us; from blackhat/defcon)

Posted by Tate Hansen Tue, 08 Aug 2006 04:24:00 GMT

I'm going through my notes from the conferences, and here is my beginning list of tools to check out (some new):

  • MatriXay:  web app and db pen-test tool (in beta, not free, not sure of cost, but it looked interesting)
  • PDB:  protocol debugger (free)
  • tattoo: traffic analysis toolkit (free)
  • Suru:  mitm proxy and web app fuzzer (not free, but cheap - $200)
  • Crowbar:  web brute force tool (free)
  • SP_LR (can't find a link yet): proxy framework targeted for malware analysis and fuzzing    


The Exploit Laboratory class at BlackHat Training was great.

Posted by Tate Hansen Mon, 07 Aug 2006 06:43:00 GMT

If you want to bump up your exploit writing skills – Saumil Udayan Shah is an excellent teacher.  His style of teaching brought out memories of my time as an ECE student at CU, Boulder.  He presented very clearly, kept the pace moving, and quipped often.  Great class.

The majority of time is spent on using GDB and WinDBG to inspect Intel 32-bit x86 CPU registers for opportunities.  The end game was always accompanied by netcat and metasploit (along with a decent amount of scripting to facilitate quick retries when trying to line up all the exploit code to ensure success). 

Here is the full class description:  http://www.blackhat.com/html/bh-usa-06/train-bh-us-06-ss-el.html


BlackHat/Defcon quickies

Posted by Tate Hansen Sun, 06 Aug 2006 03:34:00 GMT

I don’t want to repeat what everyone else is writing about BlackHat and Defcon, but several talks were freakin’ cool:

  • Joanna Rutkowska’s Blue Pill stuff. Totally own x64 Vista on AMD (Pacifica) using the new AMD processors’ virtual machine technology. Undetectable. “Writing signatures to detect things is rookie” -- an awesome quote by Joanna.

  • johnny cache and David Maynor’s layer 2 exploit. Get remote shell root access to a Mac, Windows, or whatever if the wireless card is simply ON (no need to associate or anything). Damn I would love to have this exploit on hand.

  • HD Moore’s talks:
    • Thermoptic Camouflage: IDSes and IPSes suck for lots of reasons. Signature-based IDS and IPS systems really suck. Joanna’s quote from above kind of says it all, “rookie”. With the new metasploit, you’ll be able to evade anything and everything on the market.

    • Six Degrees of XSSploitation: Cross site scripting is freakin’ dangerous. Douse with lots of browser vulns, and well, it’s getting ridiculous to have fun on the Internet. Nothing is safe, so unplug.

    • Metasploit Reloaded. The metasploit story is just getting better – it is the best framework to build exploits. The 3.0 version is being completely rewritten in Ruby so that is good for us.

  • Jeremiah Grossman’s Hacking Intranet Websites from the Outside. I haven’t seen this before: using JavaScript to surreptitiously enumerate internal IP addresses, perform port scans, retrieve portions of the user’s browser history by checking CSS values, and even log in and modify the DMZ rules in home DSL routers to allow external connections to a particular ‘live’ internal device. All done without exploiting anything, just using plain valid JavaScript.
