uprintf("There is no place like ::1\n");

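The Apple Watch as I See It

Apple Watch

The recent launch of the Apple Watch has caught a lot of eyes. Plenty of people online are discussing its looks, features, and market positioning, and some, like me, are more interested in what this new member of the Apple family actually "means".

In an article titled APPLE WATCH: ASKING WHY AND SAYING NO, the author looks back at Apple's past keynotes and reaches an interesting conclusion: every time Apple launched a new product, it first explained to the audience why Apple was building it. Every time, that is, except for the Apple Watch.

  • The iPod was launched on October 23, 2001. The iPod segment starts at 11:30, but the iPod itself doesn't appear on the big screen until 20:48. Jobs used those 10 minutes to explain the music market, why Apple could succeed in it, and what made the iPod special.

  • The iPhone was launched on January 9, 2007. The iPhone doesn't appear on screen until 7:03, multi-touch is introduced right after, and the talk only returns to the iPhone at 12:20. In between, Jobs covered the smartphone market, why Apple could succeed, and what made the iPhone special.

  • The iPad was launched on January 27, 2010. The iPad segment starts at 5:15, but the iPad itself doesn't appear on the big screen until 8:55. Jobs used the time in between to explain the gap in the market between the iPhone and the Mac, and why that gap needs a device that beats both of them in certain usage scenarios.

  • At the event that first introduced the Apple Watch, however, Cook spent hardly any words on what the watch is actually for. A single joke led straight into the product video.

This makes one all the more curious about Apple's motives. Who is the target audience? Why does Apple think it can succeed? What makes the Apple Watch different?

Past experience tells us that Apple would not launch the Apple Watch merely to follow the crowd and jump on the "wearables" bandwagon. I personally don't like Cook, but even he isn't that hopeless.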

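Not That Kind of Watch

Look closely at all the Apple Watch demos. Most of them aren't exciting at all, right? Apart from the haptic nudges during navigation, which have a bit of that "technology changes life" flavor, it looks like nothing more than a digital watch with a few extra features.

But I don't think that's the case. If we instead see it as a wearable microcomputer that happens to be strapped to your wrist, rather than as just a watch, its significance changes completely.

It is an extension of and an accessory to the iPhone, and also the first attempt at a brand-new, independent wearable product line. When the iPhone debuted, not nearly as many people were bullish on it as today, and few could foresee that a few years later this small device would turn people's lives upside down. Think about it: is the iPhone in your hand still just a phone? In a few years, will the Apple Watch still be just a watch?

Even so, going from traditional digital devices to wearable ones is a huge leap. It takes not only technical breakthroughs from tech companies but also users adapting step by step. Until the market and the ecosystem Apple has cultivated for years respond to wearables, we will probably see a quiet period.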

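Digital Products Have Long Been "Wearable"

Mention wearable devices, and tech-minded "nerds" may first think of Google Glass. But in fact people started "wearing" digital devices long ago, and that phenomenon owes a lot to the iPhone's booming sales.

Before we knew it, the people who care about new phone launches were no longer computer nerds, tech maniacs, or hardware enthusiasts, but the good-looking, fashionable people around you.

- "Oh, the iPhone 6 is out! I can't decide. Should I get the 6 or the 6 Plus?"

- "Get the Plus. The screen is bigger, so your face looks smaller when you're on the phone."

When a bigger phone screen is no longer something enthusiasts sit around and relish, but a tool for "making your face look smaller", you should realize that the phone has quietly become part of what people wear.

People have shifted from caring about the quality, performance, and service of Apple products to caring about the social and fashion role they play. In that case, why not just make some fashionable watches for you? This time you really can wear them.

Even an ordinary digital watch, once it carries the Apple logo, will without a doubt become the new darling of these customers. "Feature positioning? Market outlook? Product characteristics?" Who cares.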

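Regular Joes Compete on Specs, the Rich and Handsome Go for Style

Speaking of watches, Swiss watches easily come to mind.

Quartz watches are cheaper than mechanical watches and keep better time, yet mechanical watches still dominate the high-end luxury market. A regular Joe can buy a 10-yuan digital watch from a street stall and crush the rich guy wearing a wristwatch worth hundreds of thousands with "my watch is more accurate than yours".

Digital cameras are likewise cheap and convenient, with big storage and no film to change, yet plenty of "high-end players" still fiddle with Leicas worth hundreds of thousands and mess around with Lomo and the like, and feel all the happier for it. That's called style. You mortals wouldn't understand.

We see sky-high prices on clothes, bags, shoes, and wristwatches. The value of these products has long gone beyond function and quality; it lies in the brand and the social effect the brand brings. The desire to show off taste and social status is exactly what luxury goods satisfy.

I think this "sky-high-priced" Apple watch is an attempt at an "electronic luxury product". If clothes, shoes, and hats can talk brand regardless of performance and quality, why not electronics? In fact, Apple products costing more than comparable products on the market has long planted the idea that "Apple is the luxury brand of electronics" in consumers' minds, and this Edition model is sure to win over big spenders of every kind.

As regular Joes, we can buy a 300-odd-dollar Apple Watch to show off our "techy, young, energetic" fashion taste. We might also feel a little let down that the features aren't "explosive" enough yet.

As the rich and handsome, wearing the 120,000 Edition model slaps "I'm just rich" right in your face. Maybe the big spenders would even love to stud the 18K gold full of diamonds. Pity the middle class: the next time you give an Apple product to your girlfriend, lover, or mistress, one kidney won't cover it anymore.

Off topic: Xiaomi, BBK, and the like could perfectly well make watches and fight Apple for the market. It's just that style is something only the rich and handsome can afford to play with. "BBK 24K Gold Limited Edition Smartwatch, makes calls, sends texts, doubles as a flashlight and a lighter" is a picture a bit too tragic. Luxury isn't something you can do just because you want to. Not even Samsung can: you can make smart watches, smart glasses, smart jackets, smart slippers, and smart underwear, but you just don't dare charge 120,000.

Regular Joes compete on tech, the rich and handsome go for style. Right now only Apple can pull off a move like this. As for the future, let's wait and see.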

<< EOF

FreeBSD Kernel Rootkit Design Howtos - 5 - Character Device First cdev Module

Originally posted @ 19/06/2012

Sup guys, welcome back to today's FreeBSD Kernel Rootkit Design Howtos! I know it's been a couple of days since I posted the previous note; I was dealing with a visa issue 'cause I accidentally overstayed (I will never ever ever do that again, I promise!).

But anyway, I'm back, so let's jump right into our topic for today.

The Review

We have discussed two types of kernel modules in the past 4 sessions: General Kernel Module, and System Call Kernel Module.

Hope you still remember that it took us 3 sessions to discuss the System Call Kernel Module and a tiny little client application, and lastly we discussed a little bit on Kernel/User Space Transitions.

So I guess that'd be enough for the System Call Kernel Module since we already know how to declare it, how to call it, and how to do it in a safe way.

And today we are gonna talk about a new type of kernel module -- The Character Device Module, so get ready and hope you enjoy!

You can always find the previous notes in the ToC here: FreeBSD Kernel Rootkit Design Howtos - Introduction

The Device

Before we discuss what a Character Device is, let's talk about what a Device is first.

A device is simply a component connected to your motherboard, such as your hard disk, USB drive, phone, or even your keyboard and mouse.

Maybe you are not a Unix-like system expert, but you may have heard of one of the most famous "characteristics" of Unix-like systems: In UNIX Everything is a File (this is a great article, check it out).

That's true, and by Everything we mean documents, directories, links, USB devices, network connections, terminals, inter-process communication, and devices.

All of these are represented as files in the unified local filesystem because they are treated as streams of bytes, so you can read, write, lseek, and close them. Imagine that getting user input from the keyboard is simply a read from the keyboard device file. Very nice and clean, right?

The benefit of this design is of course a simplified and unified API for all devices, and this was a very advanced design back in the day, when other operating systems required users to use different commands to copy files from different types of devices.
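
To make the unified API a bit more concrete, here's a minimal userland sketch that talks to a device with exactly the same calls we'd use on a regular file. It assumes /dev/zero exists (it does on FreeBSD and on most Unix-like systems), and the helper name read_from_device is made up for this note.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to n bytes from the device file at path into out.
 * Returns the number of bytes read, or -1 on error.
 * (Hypothetical helper, not part of any system API.)
 */
ssize_t read_from_device(const char *path, void *out, size_t n)
{
    int fd = open(path, O_RDONLY);   /* same call as for a regular file */
    if (fd < 0)
        return -1;

    ssize_t got = read(fd, out, n);  /* the driver behind the file does the work */
    close(fd);
    return got;
}
```

Calling read_from_device("/dev/zero", buf, 8) should fill buf with eight zero bytes, and nothing in the code has to know that the "file" is really a driver.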

The above figure shows a list of files in /dev. This directory is backed by a dedicated filesystem called devfs, in which all device files are stored and managed.

The Block Device and The Character Device

Now we know that we can interact with a device by accessing its device file, but what actually controls the behavior of a device?

The answer is the kernel: first, the kernel should be the one dealing with hardware, and second, the separation between kernel modules helps prevent a single hardware error from crashing the whole system.

That means if you are a device driver developer about to work on a new hardware driver for most Unix-like systems, you'll have to write a Device Kernel Module, and the module should follow the standard and create a file in devfs for users to interact with your driver.

So the concept of a Device Kernel Module is similar to the concept of a driver, except that there are in fact many different kinds of device kernel modules designed for different purposes. Here I'm gonna talk about the two main types found in modern Unix-like systems, the Block Device Module and the Character Device Module.

  • Character Devices
    • A character device driver is one that transfers data directly to and from a user process (FreeBSD Handbook: Character Device)
    • Performs no buffering
    • Read/Write 0 or more bytes in a stream
  • Block Devices
    • Block devices are disk devices for which the kernel provides caching (FreeBSD Handbook: Block Devices)
    • Performs buffering and access randomly through a cache
    • Read/Write fixed size blocks

Block devices are used to mount disk partitions in some other Unix-like systems, and were used by FreeBSD as caching devices. The good news is, as you can see from FreeBSD Handbook: Block Devices, that the block device is deprecated by FreeBSD as a step to modernize the system design.

The caching part was moved up to the vnode layer, so we don't have to worry about block devices here at all, but it's still a good idea to understand the differences between these two types of devices.

Alright guys, take a deep breath and we are just about to officially start our topic today -- the Character Device Module (...wut?)

To wrap up the boring introduction, background, and design scenario above: the Character Device Module, as the most common type of device driver (FreeBSD Handbook: Character Device), is simply another type of kernel module that installs itself in the kernel and provides an interface, represented as a file in devfs, for users to interact with.

To successfully declare a character device module, there are three unique things we need to deal with first: a cdevsw structure, the character device functions, and a device registration routine. And of course, we'll need an event handler and a module declaration macro just like any other kernel module.

The cdevsw Structure

Like what we had in the system call module, the cdevsw structure is where we store the information about the character device module, and this structure ends up in a table of cdevsw structures known as the character device switch table.

It looks a lot like what we did in the system call module session: we had a sysent structure to store the related information, and then we put that structure into the sysent[] table for later reference. So the concept and the design rationale should be easy to understand.

The cdevsw structure is defined in sys/conf.h as the following. I added some comments for quick reference.

FILE:/usr/src/sys/sys/conf.h
/*
 * Character device switch table
 */
struct cdevsw {
        int                     d_version;
        u_int                   d_flags;    /* D_TAPE, D_DISK, D_TTY, D_MEM */
        const char              *d_name;    /* Device name in /dev */
        d_open_t                *d_open;    /* Func. pointer to dev open function */
        d_fdopen_t              *d_fdopen;
        d_close_t               *d_close;   /* Func. pointer to dev close function */
        d_read_t                *d_read;    /* Func. pointer to dev read function */
        d_write_t               *d_write;   /* Func. pointer to dev write function */
        d_ioctl_t               *d_ioctl;   /* Func. pointer to dev ioctl (an operation other than a read or a write) function */
        d_poll_t                *d_poll;    /* Polls a device to see if there is data to be read or space available for writing */
        d_mmap_t                *d_mmap;
        d_strategy_t            *d_strategy;
        dumper_t                *d_dump;
        d_kqfilter_t            *d_kqfilter;
        d_purge_t               *d_purge;
        d_mmap_single_t         *d_mmap_single;

        int32_t                 d_spare0[3];
        void                    *d_spare1[3];

        /* These fields should not be messed with by drivers */
        LIST_HEAD(, cdev)       d_devs;
        int                     d_spare2;
        union {
                struct cdevsw           *gianttrick;
                SLIST_ENTRY(cdevsw)     postfree_list;
        } __d_giant;
};

This may look very complicated, but actually only two fields are required to declare a character device: d_name and d_version.

d_name is obviously the name of the character device module. Strictly speaking, the filename in /dev comes from the name we pass to make_dev() later on; the examples here simply use the same string for both.

d_version tells the kernel which version of the cdevsw layout this module was built against. The possible values are defined in sys/conf.h as well.

FILE:/usr/src/sys/sys/conf.h
/*
 * Version numbers.
 */
#define D_VERSION_00    0x20011966
#define D_VERSION_01    0x17032005      /* Add d_uid,gid,mode & kind */
#define D_VERSION_02    0x28042009      /* Add d_mmap_single */
#define D_VERSION_03    0x17122009      /* d_mmap takes memattr,vm_ooffset_t */
#define D_VERSION       D_VERSION_03

D_VERSION is the same as the latest version, so we're gonna use it unless you specifically want your character device module to target an older cdevsw layout.

So if you decide to be a superb lazy ass, the simplest definition of a cdevsw structure can be

static struct cdevsw cd_example_cdevsw = {
    .d_version = D_VERSION,
    .d_name = "cd_example"
};

Unfortunately this character device module pretty much does nothing. Although we are not required to define all of the functions such as d_open or d_close, we still need some of them to let our module at least do something.

The Character Device Functions

There are a lot of functions we can choose to implement for our module, but as I said, we are not required to implement every function listed in the definition of the cdevsw structure.

Oh, FYI, any function that is not specified in your module's cdevsw structure is treated as an operation the device does not support.
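
If the "switch table" idea still feels abstract, here's a tiny userland model of it. This is only a sketch and not kernel code: toy_devsw, toy_read_t, toy_write_t, hello_read, and the dispatchers are all made-up names, and returning ENODEV for an empty slot is my assumption for the model.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Function *types*, in the same style as d_read_t / d_write_t. */
typedef int toy_read_t(char *dst, size_t n);
typedef int toy_write_t(const char *src, size_t n);

/* A toy switch table: one slot per operation. */
struct toy_devsw {
    const char  *t_name;
    toy_read_t  *t_read;
    toy_write_t *t_write;
};

/* Declare the handler with the typedef, then define it. */
toy_read_t hello_read;

int hello_read(char *dst, size_t n)
{
    if (n == 0)
        return EINVAL;
    snprintf(dst, n, "%s", "hello");   /* always NUL-terminates */
    return 0;
}

/* Only t_read is set; the designated initializer leaves t_write NULL. */
struct toy_devsw hello_devsw = {
    .t_name = "hello",
    .t_read = hello_read
};

/* Dispatchers: an empty slot means the operation is not supported. */
int toy_do_read(const struct toy_devsw *sw, char *dst, size_t n)
{
    return sw->t_read != NULL ? sw->t_read(dst, n) : ENODEV;
}

int toy_do_write(const struct toy_devsw *sw, const char *src, size_t n)
{
    return sw->t_write != NULL ? sw->t_write(src, n) : ENODEV;
}
```

With this table, toy_do_read(&hello_devsw, ...) reaches hello_read(), while toy_do_write(&hello_devsw, ...) fails cleanly because nobody filled in that slot.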

To demonstrate a general character device module, I'm just gonna show you the four most basic functions: d_open, d_close, d_write, and d_read. Now we can extend our lazy-ass example as follows,

static struct cdevsw cd_example_cdevsw = {
    .d_version = D_VERSION,
    .d_name = "cd_example",
    .d_open = open,
    .d_close = close,
    .d_read = read,
    .d_write = write
};

In the example above, we define a cdevsw structure named cd_example_cdevsw for a device called cd_example, and it's gonna have four character device functions with their function pointers specified.

Take note that once we name a function in a cdevsw structure, that function has to actually exist: if we forget to declare or implement it, the build (or the module load) will fail.

The prototypes of these functions are defined in the same file, sys/conf.h. Check the file for the complete list of prototypes; I'm just gonna list these four,

FILE:/usr/src/sys/sys/conf.h
/*
 * Character Device Function Prototypes.
 */
typedef int d_open_t(struct cdev *dev, int oflags, int devtype, struct thread *td);
typedef int d_close_t(struct cdev *dev, int fflag, int devtype, struct thread *td);
typedef int d_read_t(struct cdev *dev, struct uio *uio, int ioflag);
typedef int d_write_t(struct cdev *dev, struct uio *uio, int ioflag);

To quickly give you an overall understanding of how this works, we are gonna first create a read-only character device module. By read-only, I mean that we can only read a stream of bytes from the module, which runs in the kernel, back to user space. Let's see an upgraded version of our not-as-lazy-as-previous-lazy-ass example,

// Function prototype
d_read_t read; // d_read_t is typedef'd in sys/conf.h

// Define a read-only device
static struct cdevsw ro_cdevsw = {
    .d_version = D_VERSION,
    .d_read = read,
    .d_name = "cd_example_ro"
};

static char buf[512+1]; // A string in kernel space
static size_t len;

int read(struct cdev *dev, struct uio *uio, int ioflag)
{
    int error = 0;

    if (len == 0) /* len is unsigned, so it can never be negative */
        error = -1;
    else
        /* Return the saved character string to userland. */
        error = copystr(&buf, uio->uio_iov->iov_base, 513, &len);

    return(error);
}

To implement the read function, we first declared it with the d_read_t prototype. Then we defined a cdevsw structure called ro_cdevsw with only one function pointer set, the read function.

Now since we are gonna read a string from the kernel back to user space, we need to at least have a string in the kernel, right? So we defined a character buffer called buf, 512+1 bytes long.

Finally we reach the implementation of the read function. Everything is as usual except that we used copystr here.

I hope you still remember that there are four copy functions: copyin(), copyinstr(), copyout(), and copystr(). All of them copy contiguous data from kernel space to user space or vice versa, except copystr(), which copies data from one kernel space address to another kernel space address. It is declared in sys/systm.h with the following prototype.

FILE: /usr/src/sys/sys/systm.h
int     copystr(const void * __restrict kfaddr, void * __restrict kdaddr,
            size_t len, size_t * __restrict lencopied)
            __nonnull(1) __nonnull(2);

Here's what the man page says about copystr

The copystr() function copies a NUL-terminated string, at most len bytes long, from kernel-space address kfaddr to kernel-space address kdaddr. The number of bytes actually copied, including the terminating NUL, is returned in *done (if done is non-NULL).
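
To see what that description means in practice, here's a userland model of the quoted semantics. It is only a sketch: toy_copystr is a made-up name, it works on ordinary pointers instead of kernel addresses, and it follows the copy(9) behaviour of returning ENAMETOOLONG when no NUL shows up within len bytes.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy a NUL-terminated string of at most len bytes from 'from' to 'to'.
 * The number of bytes copied, including the NUL when there is one, is
 * stored in *done if done is not NULL. Returns 0 on success, or
 * ENAMETOOLONG if the string did not fit in len bytes.
 */
int toy_copystr(const void *from, void *to, size_t len, size_t *done)
{
    const char *src = from;
    char *dst = to;
    size_t copied = 0;
    int error = ENAMETOOLONG;   /* assume failure until we see the NUL */

    while (copied < len) {
        dst[copied] = src[copied];
        copied++;
        if (src[copied - 1] == '\0') {
            error = 0;
            break;
        }
    }

    if (done != NULL)
        *done = copied;
    return error;
}
```

So copying "abc" with a limit of 8 reports 4 bytes (three characters plus the NUL), which is why the example buffer is declared as 512+1 and the limit passed is 513.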

So what we did with copystr(&buf, uio->uio_iov->iov_base, 513, &len); was to copy the string in buf (which lives in kernel space) to uio->uio_iov->iov_base, treating that destination as if it were a kernel address as well, with a maximum length of 513, and to save the copied length to len.

It's funny that we used copystr(), 'cause it copies data from one kernel space address to another, while what we want is to copy the buf[] string from the kernel back to user space. That doesn't look like the read operation we expected. Strictly speaking it is a shortcut: for a read(2) coming from userland the I/O vector normally points into the caller's address space, and the proper tool for filling it is uiomove(), which we'll get to later.

To see what's going on, we have to take a look at the destination address uio->uio_iov->iov_base that came out of nowhere.

In fact, the uio structure did not come out of nowhere. It is one of the parameters in the read function prototype, and it was passed to our read function like this: int read(struct cdev *dev, struct uio *uio, int ioflag){}. That means every read performed on a character device from user space hands our function a uio structure.

So now the questions are: what is uio, and why uio? Here's its definition in sys/uio.h.

FILE:/usr/src/sys/sys/uio.h
struct uio {
        struct  iovec *uio_iov;         /* scatter/gather list */
        int     uio_iovcnt;             /* length of scatter/gather list */
        off_t   uio_offset;             /* offset in target object */
        ssize_t uio_resid;              /* remaining bytes to process */
        enum    uio_seg uio_segflg;     /* address space */
        enum    uio_rw uio_rw;          /* operation */
        struct  thread *uio_td;         /* owner */
};

I'll leave the why question to UIO(9) manpage, and it says

As a result of any read(2), write(2), readv(2), or writev(2) system call that is being passed to a character-device driver, the appropriate driver d_read or d_write entry will be called with a pointer to a struct uio being passed.

The transfer request is encoded in this structure. The driver itself should use uiomove() or uiomove_nofault() to get at the data in this structure.

That means the uio gets passed to our read() or write() function, and we are then supposed to use uiomove() or uiomove_nofault() to get at the data it describes. The manpage goes on:

The functions uiomove() and uiomove_nofault() are used to transfer data between buffers and I/O vectors that might possibly cross the user/kernel space boundary.

It seems that in order to cross the user/kernel space boundary in the kingdom of character devices, our buffer data has to go through I/O vectors, namely the iovec array uio_iov here. The I/O vector structure iovec is defined in sys/_iovec.h

FILE:/usr/src/sys/sys/_iovec.h
struct iovec {
        void    *iov_base;      /* Base address. */
        size_t   iov_len;       /* Length. */
};

Take note that iov_base is the base address of the I/O vector, so with copystr(&buf, uio->uio_iov->iov_base, 513, &len); what we are really doing is copying at most 513 bytes from the start of the buffer string &buf to the base address of the first I/O vector, uio->uio_iov->iov_base.
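
Since the manpage quoted above points drivers at uiomove(), it helps to picture what a routine like that does with the scatter/gather list. Below is a userland model, not the kernel function: toy_uio is a trimmed-down copy of struct uio, toy_uiomove is a made-up name, and it only handles the "read" direction (driver buffer into the vectors) with a plain memcpy().

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec */

/* Trimmed-down stand-in for struct uio (hypothetical). */
struct toy_uio {
    struct iovec *uio_iov;      /* scatter/gather list */
    int           uio_iovcnt;   /* entries left in the list */
    size_t        uio_resid;    /* bytes the caller still wants */
};

/*
 * Copy up to n bytes from src into the I/O vectors, filling them in
 * order and advancing base/len/resid as we go. Returns bytes copied.
 */
size_t toy_uiomove(const void *src, size_t n, struct toy_uio *uio)
{
    const char *p = src;
    size_t done = 0;

    while (n > 0 && uio->uio_resid > 0 && uio->uio_iovcnt > 0) {
        struct iovec *iov = uio->uio_iov;
        size_t cnt = iov->iov_len < n ? iov->iov_len : n;

        if (cnt == 0) {             /* this vector is full, move on */
            uio->uio_iov++;
            uio->uio_iovcnt--;
            continue;
        }
        memcpy(iov->iov_base, p, cnt);
        iov->iov_base = (char *)iov->iov_base + cnt;
        iov->iov_len -= cnt;
        uio->uio_resid -= cnt;
        p += cnt;
        n -= cnt;
        done += cnt;
    }
    return done;
}
```

The point of the model is the bookkeeping: the caller describes where the data should go, and the transfer routine walks the list and keeps uio_resid up to date, so the driver never has to touch iov_base by hand.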

We'll get to the uiomove() function later. For now, let's discuss the very last new thing we need to know to declare a character device module.

The Device Registration Routine

The registration of a device module is accomplished with the declaration macro DEV_MODULE. Before that, we need an event handler which defines the actions to be performed when the device module is loaded or unloaded.

This is simple: when the module is loaded, we call make_dev() to register our character device with the kernel and create the device file in devfs, and when the module is unloaded, we call destroy_dev() to unregister it and remove the device file from /dev.

1. make_dev() function

Let's take a look at the make_dev() function first. It is declared in sys/conf.h as shown below,

FILE:/usr/src/sys/sys/conf.h
struct cdev *make_dev(struct cdevsw *_devsw, int _unit, uid_t _uid, gid_t _gid,
                int _perms, const char *_fmt, ...) __printflike(6, 7);
  • cdevsw *_devsw: pointer to cdevsw structure of the device module which was defined previously
  • int _unit: Normally set to 0
  • uid_t _uid: The owner ID of the device file
  • gid_t _gid: The owner group ID of the device file
  • int _perms: The permissions of the device file, e.g. 0600
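  • char *_fmt: The name of the device, given as a printf-style format string (hence __printflike(6, 7)), optionally followed by its arguments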
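
The __printflike(6, 7) annotation on the prototype tells us the device name is built the way printf() builds a string. Here's a small userland sketch of that idea; toy_dev_name is a made-up helper and has nothing to do with how the kernel implements make_dev() internally.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Build a device name from a printf-style format plus arguments.
 * Returns what vsnprintf() returns: the length of the full name,
 * not counting the terminating NUL.
 */
int toy_dev_name(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, outlen, fmt, ap);
    va_end(ap);
    return n;
}
```

With a format such as "cd_example%d" and a unit number of 2, the helper produces "cd_example2". Our examples pass a plain name with no conversions, so the format string is used as-is.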

This function is straightforward: we pass the cdevsw structure to make_dev() to finally get the device registered. What's new is the return value of the function, which is a pointer to a cdev structure. Well, we've already discussed the cdevsw structure, so what's a cdev structure?

Since it's the return value of make_dev(), and that function registers our module with the devfs filesystem, it presumably holds the information devfs needs.

2. cdev structure

That's true. Let's take a look at the definition of the cdev structure, which lives in sys/conf.h as shown below,

struct cdev {
        void            *__si_reserved;
        u_int           si_flags;
#define SI_ETERNAL      0x0001  /* never destroyed */
#define SI_ALIAS        0x0002  /* carrier of alias name */
#define SI_NAMED        0x0004  /* make_dev{_alias} has been called */
#define SI_CHEAPCLONE   0x0008  /* can be removed_dev'ed when vnode reclaims */
#define SI_CHILD        0x0010  /* child of another struct cdev **/
#define SI_DEVOPEN      0x0020  /* opened by device */
#define SI_CONSOPEN     0x0040  /* opened by console */
#define SI_DUMPDEV      0x0080  /* is kernel dumpdev */
#define SI_CANDELETE    0x0100  /* can do BIO_DELETE */
#define SI_CLONELIST    0x0200  /* on a clone list */
        struct timespec si_atime;
        struct timespec si_ctime;
        struct timespec si_mtime;
        uid_t           si_uid;
        gid_t           si_gid;
        mode_t          si_mode;
        struct ucred    *si_cred;       /* cached clone-time credential */
        int             si_drv0;
        int             si_refcount;
        LIST_ENTRY(cdev)        si_list;
        LIST_ENTRY(cdev)        si_clone;
        LIST_HEAD(, cdev)       si_children;
        LIST_ENTRY(cdev)        si_siblings;
        struct cdev *si_parent;
        char            *si_name;
        void            *si_drv1, *si_drv2;
        struct cdevsw   *si_devsw;
        int             si_iosize_max;  /* maximum I/O size (for physio &al) */
        u_long          si_usecount;
        u_long          si_threadcount;
        union {
                struct snapdata *__sid_snapdata;
        } __si_u;
        char            __si_namebuf[SPECNAMELEN + 1];
};

Oh my sweet god, that looks scary. Do we have to set all these fields? The simple answer is no. We don't have to explicitly fill in anything here, since make_dev() returns the structure already set up for us.

3. destroy_dev() function

Everything is just automagical, oh, except one thing -- we have to declare a cdev pointer to receive the return value of make_dev(), since we'll need it for destroy_dev(). The destroy_dev() function is declared in sys/conf.h as shown below,

FILE:/usr/src/sys/sys/conf.h
void    destroy_dev(struct cdev *_dev);

This is super simple: destroy_dev() unregisters the device and removes the device file from /dev, so it needs the cdev pointer we just got from make_dev() to know which device to destroy.

Thus, we can update our example as the following,

// Function prototype
d_read_t read; //d_read_t is typedef'd in sys/conf.h

// Define a read-only device
static struct cdevsw ro_cdevsw = {
    .d_version = D_VERSION,
    .d_read = read,
    .d_name = "cd_example_ro"
};

int read(struct cdev *dev, struct uio *uio, int ioflag)
{
    uprintf("Read function called!\n");
    return(0);
}

/* Event Handler Code Here... */
...
static struct cdev *sdev;
sdev = make_dev(&ro_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "cd_example_ro");
destroy_dev(sdev);
...

/* Declaration Macro Here... */
DEV_MODULE(...)

In the above example, we set the owner of the device to root, the owner group to wheel, and the permissions to 0600. The predefined UID_* and GID_* constants are listed in sys/conf.h

FILE:/usr/src/sys/sys/conf.h
#define         UID_ROOT        0
#define         UID_BIN         3
#define         UID_UUCP        66
#define         UID_NOBODY      65534

#define         GID_WHEEL       0
#define         GID_KMEM        2
#define         GID_TTY         4
#define         GID_OPERATOR    5
#define         GID_BIN         7
#define         GID_GAMES       13
#define         GID_DIALER      68
#define         GID_NOBODY      65534

4. DEV_MODULE macro

Here comes the last part of the device registration routine, the declaration macro. The definition of DEV_MODULE macro is in sys/conf.h as shown below,

FILE:/usr/src/sys/sys/conf.h
#define DEV_MODULE(name, evh, arg)                                      \
static moduledata_t name##_mod = {                                      \
    #name,                                                              \
    evh,                                                                \
    arg                                                                 \
};                                                                      \
DECLARE_MODULE(name, name##_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE)

Notice that when we use DEV_MODULE, we eventually call the general DECLARE_MODULE with the subsystem set to SI_SUB_DRIVERS and the order set to SI_ORDER_MIDDLE, so the module is initialized along with the other drivers, somewhere in the middle.
So all we need to do is specify the module name, the event handler function pointer, and an additional argument if we have any. Really simple.
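
If the #name and name##_mod parts of the macro look like black magic, here's a userland sketch of the same two preprocessor tricks. Everything in it is made up for illustration (TOY_MODULE, toy_moduledata, toy_load, TOY_LOAD/TOY_UNLOAD), and it leaves out the DECLARE_MODULE step, which only makes sense inside the kernel.

```c
#include <stddef.h>

/* Stand-in for moduledata_t: name, event handler, extra argument. */
struct toy_moduledata {
    const char *name;
    int       (*evhand)(int cmd, void *arg);
    void       *priv;
};

/*
 * #name turns the macro argument into a string literal, and
 * name##_mod pastes it into a brand new identifier.
 */
#define TOY_MODULE(name, evh, arg)          \
struct toy_moduledata name##_mod = {        \
    #name,                                  \
    evh,                                    \
    arg                                     \
}

enum { TOY_LOAD, TOY_UNLOAD };

/* Toy event handler: accept load/unload, reject anything else. */
int toy_load(int cmd, void *arg)
{
    (void)arg;
    return (cmd == TOY_LOAD || cmd == TOY_UNLOAD) ? 0 : -1;
}

/*
 * Expands to:
 *   struct toy_moduledata cd_example_ro_mod = { "cd_example_ro", toy_load, NULL };
 */
TOY_MODULE(cd_example_ro, toy_load, NULL);
```

One macro argument ends up both as the variable name (cd_example_ro_mod) and as the string stored inside it ("cd_example_ro"), which is the same pattern DEV_MODULE uses to fill in its moduledata_t.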

5. Connecting the dots

Yes, yes, I know this session is awfully long, but fortunately we are done now... or there's just a little bit left.

Now let's connect all the pieces of code together to see what a character device registration routine looks like.

// Function prototype
d_read_t read; //d_read_t is typedef'd in sys/conf.h

// Define a read-only device
static struct cdevsw ro_cdevsw = {
    .d_version = D_VERSION,
    .d_read = read,
    .d_name ="cd_example_ro"
};

static char buf[512+1]; //A string in kernel space
static size_t len;

int read(struct cdev *dev, struct uio *uio, int ioflag)
{
    int error = 0;

    if (len == 0) /* len is unsigned, so it can never be negative */
        error = -1;
    else
        /* Return the saved character string to userland. */
        error = copystr(&buf, uio->uio_iov->iov_base, 513, &len);

    return(error);
}

/* Reference to the device in DEVFS */
static struct cdev *sdev;

static int load(struct module *module, int cmd, void *arg)
{
    int error = 0;

    switch (cmd)
    {
        case MOD_LOAD:
            sdev = make_dev(&ro_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "cd_example_ro");
            uprintf("Character device loaded.\n");
            break;

        case MOD_UNLOAD:
            destroy_dev(sdev);
            uprintf("Character device unloaded.\n");
            break;

        default:
            error = EOPNOTSUPP;
            break;
    }

    return(error);
}

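DEV_MODULE(cd_example_ro, load, NULL);

And here is the complete example, this time with all four functions (open, close, read, and write) in place:
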
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/conf.h>
#include <sys/uio.h>

// Function prototypes
d_open_t open;
d_close_t close;
d_read_t read;
d_write_t write;

static struct cdevsw cd_example_cdevsw = {
    .d_version = D_VERSION,
    .d_open = open,
    .d_close = close,
    .d_read = read,
    .d_write = write,
    .d_name = "cd_example"
};

static char buf[512+1]; // A string in kernel space
static size_t len;

int open(struct cdev *dev, int flag, int otyp, struct thread *td)
{
    /* Initialize character buffer. */
    memset(&buf, '\0', 513);
    len = 0;

    return(0);
}

int close(struct cdev *dev, int flag, int otyp, struct thread *td)
{
    return(0);
}

int write(struct cdev *dev, struct uio *uio, int ioflag)
{
    int error = 0;

    /*
     * Take in a character string, saving it in buf.
     * Note: The proper way to transfer data between buffers and I/O
     * vectors that cross the user/kernel space boundary is with
     * uiomove(), but this way is shorter. For more on device driver I/O
     * routines, see the uio(9) manual page.
     */
    error = copyinstr(uio->uio_iov->iov_base, &buf, 512, &len);
    if (error != 0)
        uprintf("Write to \"cd_example\" failed.\n");

    return(error);
}

int read(struct cdev *dev, struct uio *uio, int ioflag)
{
    int error = 0;

    if (len == 0) /* len is unsigned, so it can never be negative */
        error = -1;
    else
        /* Return the saved character string to userland. */
        error = copystr(&buf, uio->uio_iov->iov_base, 513, &len);

    return(error);
}

/* Reference to the device in DEVFS */
static struct cdev *sdev;

static int load(struct module *module, int cmd, void *arg)
{
    int error = 0;

    switch (cmd)
    {
        case MOD_LOAD:
            sdev = make_dev(&cd_example_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "cd_example");
            uprintf("Character device loaded.\n");
            break;

        case MOD_UNLOAD:
            destroy_dev(sdev);
            uprintf("Character device unloaded.\n");
            break;

        default:
            error = EOPNOTSUPP;
            break;
    }

    return(error);
}

DEV_MODULE(cd_example, load, NULL);

FreeBSD Kernel Rootkit Design Howtos - 4 - Kernel and User Space Transitions

Originally posted @ 10/06/2012

What's up geeks, it's good to see you again! This is the 4th tutorial in FreeBSD Kernel Rootkit Design Howtos. Check back every day for new tutorials and examples.

Just in case you haven't read the previous ones, please do so now. Go back to the menu, FreeBSD Kernel Rootkit Design Howtos – Introduction, for a list of all posts.

The Review

Well, in the previous session, we discussed an upgraded version of the client application that issues system calls. We used int modfind(const char *modname); to get the id of a kernel module from its name, and then we retrieved the module's offset value by calling int modstat(int modid, struct module_stat *stat);. We learned that the offset value is actually stored as int intval; in typedef union modspecific, which is a sub-union of struct module_stat.

Check back to the previous session if you are unsure what I'm talking about.

Now, that last session was all about calling a system call module in a flexible way, but as a prerequisite for this session, I want you to try to remember what we talked about in FreeBSD Kernel Rootkit Design Howtos - 2 - System Call First Kernel Service Module

I hope you still remember that we created our very first system call module, which expects a string argument from a userland application and then prints out the string it received. You don't have to recall the whole process since we'll redo that in this session; what you do need to recall is that we made a mistake.

Modern operating systems segregate their memory into user space and kernel space, and code running in one section doesn't directly access the other's resources. The way we assigned a user space structure pointer to a kernel space local variable (uap = (struct sc_example_args *)syscall_args;) is unsafe and not recommended.

We will deal with this problem and fix that example code in this session. So do check back to FreeBSD Kernel Rootkit Design Howtos - 2 - System Call First Kernel Service Module to get an idea of the context.

Kernel and User Space Transition

Before we start fixing our example code, I'd like to talk a little bit more about kernel space and user space.

We all know that modern operating systems segregate their virtual memory into user space and kernel space; user-mode applications normally run in user space, and the system kernel runs in kernel space. Now the question is why. Why does such a separation exist, and what is its purpose?

We can think of user space as a set of isolated sandboxes: it restricts userland applications so that they don't mess with each other's resources and, most importantly, with the kernel space.

As we know, the kernel is the core of any operating system and is in charge of hardware resource allocation, scheduling, I/O, process management, and all sorts of low-level stuff. We want to maintain the stability and security of the system, so we really don't want poorly designed user applications crashing the whole system, or malicious user applications modifying the behavior of the kernel.

Although access is restricted, it's still possible for user space and kernel space to reach each other's virtual memory, since such communication is necessary. We just have to make sure we do it in a way that keeps the impact on system stability and security minimal.

  • User space CANNOT access kernel space directly; if you have to, issue a system call
  • Kernel space CAN access user space, because it literally owns everything on the system
  • However, it is recommended to copy the memory area from user space into kernel space before accessing it; otherwise the access may result in a fatal panic

chart4_1

The figure above simply says that user-mode applications can only access kernel resources via system calls, and that the system call module will return a value or perform pre-defined actions. This is much safer than letting user-mode applications freely modify kernel resources or call kernel functions: every resource change or function call is performed by pre-programmed, well-designed code running in the kernel, which is considered safer and more stable.

The figure also says that kernel modules running in kernel space normally copy user space resources into their own virtual memory area first, and then perform pre-defined actions to either return a value to user space or modify the resource content.

Copying user space resources into kernel space before reading or modifying them is safer and recommended because, if we reference a user space resource directly from kernel space and it is swapped out or not faulted in yet, we can trigger a fatal panic.

That's why I said in FreeBSD Kernel Rootkit Design Howtos - 2 - System Call First Kernel Service Module that using uap = (struct sc_example_args *)syscall_args; and then printing the user space string directly through that pointer was a huge mistake.

Now let's figure out how to copy resources around virtual memory areas.

The COPY(9)

Open your terminal and run the following command to see the manpage for all copy functions

myBSD# man copy

The copy functions are copyin, copyout, copystr, and copyinstr. Here is how the manpage describes them:

The copy functions are designed to copy contiguous data from one address to another. All but copystr() copy data from user-space to kernel-space or vice-versa.

I find these four function names a little bit confusing, so let me put a figure here to help you remember their usages.

chart4_2

The following is a list of all four copy functions with their definitions and usage

  • copyin()

    • int copyin(const void *uaddr, void *kaddr, size_t len);

    • Copies len bytes of data from the user-space address uaddr to the kernel-space address kaddr.

  • copyout()

    • int copyout(const void *kaddr, void *uaddr, size_t len);

    • Copies len bytes of data from the kernel-space address kaddr to the user-space address uaddr.

  • copyinstr()

    • int copyinstr(const void *uaddr, void *kaddr, size_t len, size_t *done);

    • Copies a NUL-terminated string, at most len bytes long, from user-space address uaddr to kernel-space address kaddr. The number of bytes actually copied, including the terminating NUL, is returned in *done (if done is non-NULL).

  • copystr()

    • int copystr(const void *kfaddr, void *kdaddr, size_t len, size_t *done);

    • Copies a NUL-terminated string, at most len bytes long, from kernel-space address kfaddr to kernel-space address kdaddr. The number of bytes actually copied, including the terminating NUL, is returned in *done (if done is non-NULL).

You can see that among these four copy functions, copystr() copies from kernel space to kernel space, copyin() and copyinstr() copy memory from user space to kernel space, and copyout() copies memory from kernel space to user space.
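The kernel functions can't be called from a normal program, but the contract of copyinstr() is easy to imitate. Here is a minimal userland sketch of that contract; model_copyinstr is a made-up name, a NULL source stands in for an unmapped user address, and the real function of course does far more work to validate the address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Userland model of the copyinstr(9) contract (an illustration only,
 * not kernel code). A NULL source plays the role of a bad user address.
 */
static int
model_copyinstr(const char *src, char *dst, size_t len, size_t *done)
{
    size_t copied = 0;
    int error = ENAMETOOLONG;   /* stays set unless a NUL fits in len bytes */

    assert(dst != NULL);
    if (src == NULL) {
        error = EFAULT;
    } else {
        while (copied < len) {
            dst[copied] = src[copied];
            copied++;
            if (dst[copied - 1] == '\0') {
                error = 0;      /* terminator copied, string is complete */
                break;
            }
        }
    }
    if (done != NULL)
        *done = copied;         /* bytes copied, including the NUL if any */
    return (error);
}
```

Calling it with "hi" and an 8-byte buffer should report 3 bytes copied, while a 13-character string should fail with ENAMETOOLONG after filling the buffer.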

The Copyin Functions

From the perspective of a system call module, most of the time we need to copy memory from user space to kernel space. There are two copy functions we can use for this task: copyin() and copyinstr().

It is necessary to talk about the difference between copyin() and copyinstr() before we choose which one to use for our upgraded example system call module.

Characteristics | copyin() | copyinstr()
Direction | From user space to kernel space | From user space to kernel space
Resource Type | Any user space memory address | NUL-terminated string (ends with \0)
Copy Length | Exactly len bytes | At most len bytes
Extra Argument | NONE | size_t *done stores the number of bytes actually copied, including the terminating NUL
Return Value | 0 = success, EFAULT = bad address | 0 = success, EFAULT = bad address
Extra Return Value | NONE | ENAMETOOLONG = the string is longer than len bytes

As you can see from the table above, these two functions perform basically the same action but fit different scenarios. Since we need to copy a string from user space to kernel space in our example system call module, we are gonna use copyinstr() this time.
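For contrast, here is a rough userland sketch of the case where copyin() would be the natural choice: a fixed-size record whose length is known up front. Both struct model_request and model_copyin are hypothetical names invented for this illustration, and memcpy only imitates the "exactly len bytes" behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size argument record, used only for this illustration. */
struct model_request {
    int  id;
    char tag[4];
};

/* Model of copyin(9): copy exactly len bytes, or fail with EFAULT. */
static int
model_copyin(const void *uaddr, void *kaddr, size_t len)
{
    assert(kaddr != NULL);
    if (uaddr == NULL)          /* NULL stands in for a bad user address */
        return (EFAULT);
    memcpy(kaddr, uaddr, len);
    return (0);
}

/* Fetch a whole record: the size is fixed, so no terminator is involved. */
static int
model_fetch_request(const struct model_request *user, struct model_request *out)
{
    return (model_copyin(user, out, sizeof(*out)));
}
```

There is no done argument and no ENAMETOOLONG here, because nothing about the length is discovered during the copy.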

The Example

Finally, here comes our example code. Just remember to include two extra header files.

/*
* FILE: /root/rootkit/4.1/safe_sc_example.c
* Example 4.1
* User and Kernel Space Transitions 
* FreeBSD Rootkit Design Howtos @ www.hailang.me
*/
#include <sys/types.h>
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/module.h>
#include <sys/sysent.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/sysproto.h>

/* The system call's arguments. */
struct sc_example_args {
    char *str;
};

/* The system call function. */
static int sc_example(struct thread *td, void *syscall_args)
{
    struct sc_example_args *uap;
    char kernstr[1024+1]; //Kernel space buffer that holds a copy of the string
    int err = 0; //Return status of copyinstr()
    size_t size = 0; //Number of bytes actually copied, including the NUL

    uap = (struct sc_example_args *)syscall_args;

    //Copy the string from user space into our kernel space buffer
    err = copyinstr(uap->str, kernstr, 1024, &size);
    if (err != 0) //EFAULT or ENAMETOOLONG
        return(err);

    printf("Safer version output: %s\n",kernstr);
    return (0);
}

/* The sysent for the new system call */
static struct sysent sc_example_sysent = {
    1,  /* number of arguments */
    sc_example  /* implementing function */
};

/* The offset in sysent[] where the system call is to be allocated. */
static int offset = NO_SYSCALL;

/* The function called at load/unload. */
static int load(struct module *module, int cmd, void *arg)
{
    int error = 0;

    switch(cmd) {
    case MOD_LOAD:
        uprintf("System call loaded at offset %d.\n", offset);
        break;
    case MOD_UNLOAD:
        uprintf("System call unloaded from offset %d.\n", offset);
        break;
    default:
        error = EOPNOTSUPP;
        break;
    }

    return(error);
}

SYSCALL_MODULE(sc_example, &offset, &sc_example_sysent, load, NULL);

Save the above code as safe_sc_example.c, and the following as makefile in the same directory

KMOD=   safe_sc_example
SRCS=   safe_sc_example.c

.include <bsd.kmod.mk>

Now we can build our safer system call example by issuing the make command

myBSD# make
Warning: Object directory not changed from original /root/rootkit/4.1
@ -> /usr/src/sys
machine -> /usr/src/sys/amd64/include
x86 -> /usr/src/sys/x86/include
cc -O2 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc   -I. -I@ -I@/contrib/altq -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common  -fno-omit-frame-pointer  -mno-sse -mcmodel=kernel -mno-red-zone -mno-mmx -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option   -c safe_sc_example.c
ld  -d -warn-common -r -d -o safe_sc_example.ko safe_sc_example.o
:> export_syms
awk -f /sys/conf/kmod_syms.awk safe_sc_example.ko  export_syms | xargs -J% objcopy % safe_sc_example.ko
objcopy --strip-debug safe_sc_example.ko

Check that the safe_sc_example.ko file was built successfully, then load it into the running kernel with

myBSD# kldload ./safe_sc_example.ko
System call loaded at offset 210.

Great, now you can use the client application we built in the previous session to make the system call. Just in case you feel as lazy as I do, here's the client application code again,

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/module.h>

int main(int argc, char *argv[]) //Main function of our client application
{
    if (argc != 2) { //Check the parameter passed to the application, 
                     //print out usage help information 
                     //if we got wrong number of parameters.
        printf("Usage:\n%s <string>\n", argv[0]);
        exit(0);
    }

    struct module_stat stat; //We'll get the module status and store it here.
    int syscall_num; //We'll get the offset value and store it here.


    /* Determine sc_example's offset value. */
    stat.version = sizeof(stat); //set version to sizeof(struct module_stat)
    modstat(modfind("sys/sc_example"), &stat); //With prefix this time
    syscall_num = stat.data.intval;

    /* Call sc_example. */
    return(syscall(syscall_num, argv[1])); //Return the status of the system call
}

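Save the source code in client.c and build it as shown. The warning appears because the code calls exit() without including <stdlib.h>; adding that include silences it.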

myBSD# gcc -o client client.c
client.c: In function 'main':
client.c:12: warning: incompatible implicit declaration of built-in function 'exit'

Now we can officially call our safer system call module example by

myBSD# ./client May\ the\ source\ be\ with\ you
myBSD# dmesg | tail -n 1
Safer version output: May the source be with you

Here's what it looks like

chart4_3

This is great! We fine-tuned our system call module with the powerful copy functions, and this should be enough for now. User and kernel space transition is a very big topic that involves many kernel design aspects and concepts. I cannot cover all of them in this short tutorial, but I'll surely get back to it once we need to.

Hope you enjoy this tutorial, leave a comment below to let me know your suggestions, and May the source be with you!

Back to Menu: FreeBSD Kernel Rootkit Design Howtos – Introduction

FreeBSD Kernel Rootkit Design Howtos - 3 - System Call First Kernel Service Application

Original Posted @ 09/06/2012

Welcome back to FreeBSD Kernel Rootkit Design Howtos, where we walk through all the techniques you need to program your own BSD kernel rootkit. Please be sure you've read the previous guides before you proceed with this one.

Get back to FreeBSD Kernel Rootkit Design Howtos – Introduction for a complete table of contents.

The Review

As usual, let's review what we discussed in the last session. A new kind of kernel module was introduced: the system call module. It registers itself as a kernel service and waits for users to call it via system calls.

SYSCALL_MODULE is the registration macro that system call modules use to install themselves into the running kernel, and it requires five parameters:

  • name is a generic name for the system call module

  • offset is a pointer to the offset value, which determines where in the sysent[] table our system call module's sysent structure is saved

  • new_sysent is a pointer to a sysent structure that contains basic information about the system call, such as the number of arguments and a pointer to its implementing function, which defines what the module does every time it receives a system call

  • evh is the event handler function

  • arg is the argument passed to the event handler function, which we set to NULL

Again, please make sure that you know what I'm talking about before you proceed with this tutorial; you may want to check out the last session if you are unsure of any of the above terms.

The Client Application

I promised that there would be a client application to issue system calls to our system call module, instead of the long and nasty perl command we used in the last session. And yep, that is what we are gonna do today.

It will be better and nicer because:

  • It'll be a real client, not a command
  • It'll be flexible, meaning we don't have to know the offset value of the system call module to send commands to it

Take this as our objective, and now let's figure out how to achieve it.

The modfind Function

The modfind function is our key to solving the inflexibility issue. It is very useful but a little bit confusing.

As the name implies, the modfind function finds a specific module in a running kernel, given the module name. It is sweet 'cause keeping track of module names is much easier than keeping track of a bunch of offset values.

Now the confusing part is that, contrary to what you may have guessed, the return value of this function is NOT the system call module's offset value, but its id. Keep this in mind but don't worry about it for now; we'll talk more about this later.

Here is the prototype of the modfind function.

#include <sys/param.h>
#include <sys/module.h>

int modfind(const char *modname);

The modstat Function

As you know, calling modfind only gets us the id of the system call module, so here's what really gives us the answer: the modstat function.

#include <sys/param.h>
#include <sys/module.h>

int modstat(int modid, struct module_stat *stat);

The int modid is where we should pass the result of modfind function, but we won't get the offset value from the function's return value.

Instead, we have to construct a module_stat structure, and the modstat function will save results to it. That is why we are passing a pointer to a module_stat structure as the second parameter.

The module_stat structure is defined in sys/module.h as shown below

FILE: /usr/src/sys/sys/module.h

struct module_stat {
        int             version;        /* set to sizeof(struct module_stat) */
        char            name[MAXMODNAME];
        int             refs;
        int             id;
        modspecific_t   data;
};

typedef union modspecific {
        int     intval;
        u_int   uintval;
        long    longval;
        u_long  ulongval;
} modspecific_t;

I know this looks messy, but luckily we don't need to deal with all of it.

The first part of the code is the definition of the module_stat structure. It contains a modspecific_t union, which is defined in the second part, and our offset value is stored in this union as intval.

Confusing, right? You don't see any variable name here with "offset" in it. The good news is that this is all we have to learn today; we'll get to the example code after the summary.
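To make the version field and the union a bit less abstract, here is a small userland sketch with trimmed-down copies of the two structures (name[] and refs are dropped). The model_fill_stat function is a hypothetical stand-in for the kernel side of modstat, and the assumption that it accepts exactly one structure size is a simplification made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified copies of the structures quoted above; model_ names are made up. */
typedef union model_modspecific {
    int           intval;
    unsigned int  uintval;
    long          longval;
    unsigned long ulongval;
} model_modspecific_t;

struct model_module_stat {
    int                 version;  /* caller sets this to sizeof(struct) */
    int                 id;
    model_modspecific_t data;
};

/*
 * Hypothetical filler playing the kernel's part: it refuses a structure
 * whose version field was not set by the caller beforehand.
 */
static int
model_fill_stat(struct model_module_stat *st, int id, int offset)
{
    assert(st != NULL);
    if (st->version != (int)sizeof(*st))
        return (EINVAL);
    st->id = id;
    st->data.intval = offset;     /* the offset travels in the union */
    return (0);
}
```

The point of the sketch is the calling convention: set version first, make the call, then read the result out of data.intval.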

The Summary

We now know that having a native client to call system call modules is much better than the perl command we used in the last session. It is better because:

  • We don't have to remember the offset value of the system call module to call it. This is especially useful when we have to maintain multiple system call modules

  • The offset value of a system call module can change when we reload it. We keep our flexibility because we rely on the module name, which is not likely to change over time

The client works by using modfind to find the id of a given module name. Sadly, that id is not the module's offset value, so we can't use it to issue the system call.

So we have to use another function, modstat, which fills in a module_stat structure containing a modspecific union; the intval member of that union holds the real offset value we want.

Here's a figure to make your life miserable, you can thank me later. : ] Oh god, why I'm so bad at drawing

chart3_1

The Example

Now it's time for us to see some examples. Let's start with the example from Designing BSD Rootkits: An Introduction to Kernel Hacking, with some additional comments to help you understand the code.

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/module.h>

int main(int argc, char *argv[]) //Main function of our client application
{
    if (argc != 2) { //Check the parameter passed to the application,
                     //print out usage help information
                     //if we got wrong number of parameters.
        printf("Usage:\n%s <string>\n", argv[0]);
        exit(0);
    }

    struct module_stat stat; //This is the module_stat structure
    int syscall_num; //We'll get the offset value and store it here.


    /* Determine sc_example's offset value. */
    stat.version = sizeof(stat); //set version to sizeof(struct module_stat)
    modstat(modfind("sc_example"), &stat);
    syscall_num = stat.data.intval;

    /* Call sc_example. */
    return(syscall(syscall_num, argv[1])); //Return the status of the system call
}

There's just one thing in the code above that you may not know: the syscall function. It's very simple; all we have to do is give it the offset value and the arguments we want to pass to that system call.

syscall(syscall_num, para);
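If you want to play with syscall() without loading any module, a well-known system call works just as well. The sketch below, which is my own illustration rather than part of the tutorial's client, asks for the process id through syscall() using SYS_getpid; pid_via_syscall is a made-up helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes syscall() on glibc; harmless elsewhere */
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask for the process id through the generic syscall() entry point. */
static long
pid_via_syscall(void)
{
    /* First argument is the system call number; getpid takes no others. */
    long pid = (long)syscall(SYS_getpid);

    assert(pid > 0);
    return (pid);
}
```

The result should match what getpid() returns, since both end up in the same kernel service.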

Unlike kernel modules, for which we need a makefile to automate some extremely nasty stuff, this simple client-side application doesn't need one. We can compile the code directly like this,

myBSD# gcc -o client client.c
client.c: In function 'main':
client.c:12: warning: incompatible implicit declaration of built-in function 'exit'
myBSD# ls
client      client.c

Looks good so far, we’ve got the client executable file. Let’s load the previous sc_example module and try to call it with our client application.

myBSD# kldload ./sc_example.ko
System call loaded at offset 210.
myBSD# ./client
Usage:
./client <string>
myBSD# ./client Hey\ Kernel!
Bad system call (core dumped)

We loaded the sc_example module at offset 210 and got a nice help message when the wrong number of parameters was specified, but we ended up with a Bad system call error when we gave it the right number of parameters.

This is bad, because we have nearly no clue about what happened or what caused it. But before we panic, let's calm down and think about the error message, which is our only lead: Bad system call.

It tells us that the number we passed to syscall did not refer to a valid system call, and that is why our little client crashed. If the system call number is wrong, then the result of modfind or modstat is wrong, or maybe both of them are wrong, who knows.

So I did a little bit of debugging, and I soon realized that the return value of modfind is -1, which is obviously bad.

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/module.h>

int main(int argc, char *argv[]) //Main function of our client application
{
...
    printf("The return value of modfind is: %d\n", modfind("sc_example"));
...
}

The result says

myBSD# gcc -o client client.c
myBSD# ./client
The return value of modfind is: -1

Alright, -1. It probably means the function failed because the given module was not found. Given that the module name is the only parameter of the modfind function, the module name must be wrong.

This is interesting: we believed the module name was right, so something must have gone wrong from the very beginning. The very source of a system call module is of course its declaration macro, so let's look there and see if we have any luck.

FILE:/usr/src/sys/sys/sysent.h
#define SYSCALL_MODULE(name, offset, new_sysent, evh, arg)      \
static struct syscall_module_data name##_syscall_mod = {        \
        evh, arg, offset, new_sysent, { 0, NULL, AUE_NULL }     \
};                                                              \
                                                                \
static moduledata_t name##_mod = {                              \
        "sys/" #name,                                           \
        syscall_module_handler,                                 \
        &name##_syscall_mod                                     \
};

Here we go, a module name prefix! That means every kernel module registered with SYSCALL_MODULE is given a sys/ prefix in front of its name.
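The prefix comes from two preprocessor features working together: #name turns the macro argument into a string literal, and adjacent string literals are glued into one. Here is the same trick in isolation; MODEL_SYSCALL_MODNAME and model_name_matches are hypothetical names used only for this demonstration.

```c
#include <assert.h>
#include <string.h>

/* Same preprocessor trick as the "sys/" #name line above, in isolation. */
#define MODEL_SYSCALL_MODNAME(name) "sys/" #name

/* Returns non-zero when candidate is the name the module would register. */
static int
model_name_matches(const char *candidate)
{
    assert(candidate != NULL);
    return (strcmp(candidate, MODEL_SYSCALL_MODNAME(sc_example)) == 0);
}
```

With this in place, "sys/sc_example" matches and the bare "sc_example" does not, which is the same mismatch our client ran into.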

Apparently this is something new, something that didn't exist when Designing BSD Rootkits: An Introduction to Kernel Hacking was published. Maybe the FreeBSD development team is trying to categorize different kinds of kernel modules by giving them prefixes.

Now that we know the root of the problem, let's modify our code and try again.

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/module.h>

int main(int argc, char *argv[]) //Main function of our client application
{
    if (argc != 2) { //Check the parameter passed to the application,
                     //print out usage help information
                     //if we got wrong number of parameters.
        printf("Usage:\n%s <string>\n", argv[0]);
        exit(0);
    }

    struct module_stat stat; //We'll get the module status and store it here.
    int syscall_num; //We'll get the offset value and store it here.


    /* Determine sc_example's offset value. */
    stat.version = sizeof(stat); //set version to sizeof(struct module_stat)
    modstat(modfind("sys/sc_example"), &stat); //With prefix this time
    syscall_num = stat.data.intval;

    /* Call sc_example. */
    return(syscall(syscall_num, argv[1])); //Return the status of the system call
}

Everything is the same except modstat(modfind("sys/sc_example"), &stat);. Now compile and run.

myBSD# gcc -o client client.c
client.c: In function 'main':
client.c:12: warning: incompatible implicit declaration of built-in function 'exit'
myBSD# ./client Hey\ Kernel!\ What\'s\ Up\?
myBSD# dmesg | tail -n1
Hey Kernel! What's Up?

Here's what it looks like

chart3_2

Yada! Our little client is doing its job, and we've achieved our objective: calling a system call module without knowing its offset value.

Let's do a little experiment before we call this a day, I need to prove something.

A Little Tiny Experiment

Still remember how I emphasized like crazy that the id returned by modfind is NOT the module's real offset value? Here's a tiny example to prove that point.

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/module.h>

int main(int argc, char *argv[]) //Main function of our client application
{
    struct module_stat stat;
    int module_id;
    stat.version = sizeof(stat);

    module_id = modfind("sys/sc_example");
    modstat(module_id, &stat);
    printf("The module_id of sys/sc_example is:%d, but it's offset value is:%d\n", module_id, stat.data.intval);
}

And this is how truth is proved

myBSD# gcc -o exp exp.c
myBSD# ./exp
The module_id of sys/sc_example is:451, but it's offset value is:210
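The same two-step lookup can be imitated entirely in userland, which also makes it easy to see why each step deserves an error check. This is a toy model only: the model_ functions are made up, the single table entry reuses the 451 and 210 from the output above, and the real modfind and modstat talk to the kernel instead of an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the kernel's module list. */
struct model_module {
    const char *name;    /* registered name, including the sys/ prefix */
    int         id;      /* what the modfind step hands back */
    int         offset;  /* what the modstat step exposes as data.intval */
};

static const struct model_module model_modules[] = {
    { "sys/sc_example", 451, 210 },
};
#define MODEL_NMODULES (sizeof(model_modules) / sizeof(model_modules[0]))

/* Step 1: name -> id, or -1 when no module has that name. */
static int
model_modfind(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < MODEL_NMODULES; i++)
        if (strcmp(model_modules[i].name, name) == 0)
            return (model_modules[i].id);
    return (-1);
}

/* Step 2: id -> offset, or -1 when the id is unknown. */
static int
model_modstat_offset(int id, int *offset)
{
    size_t i;

    assert(offset != NULL);
    for (i = 0; i < MODEL_NMODULES; i++) {
        if (model_modules[i].id == id) {
            *offset = model_modules[i].offset;
            return (0);
        }
    }
    return (-1);
}

/* Both steps; *offset is left untouched if either one fails. */
static int
model_lookup_offset(const char *name, int *offset)
{
    int id = model_modfind(name);

    if (id == -1)
        return (-1);
    return (model_modstat_offset(id, offset));
}
```

A client built this way would stop at the failed lookup instead of passing a garbage number to syscall, which is what produced the Bad system call crash earlier.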

Alright, that will be all for today, thank you for reading this tutorial, like it and share it to support my work, and I'll see you in the next tutorial!

FreeBSD Kernel Rootkit Design Howtos - 2 - System Call First Kernel Service Module

Original Posted @ 09/06/2012

So, this is the second tutorial on FreeBSD Kernel Rootkit Design. I hope you have worked through the previous one before you continue, since it is obviously a prerequisite.

So far we've discussed that KLD is the way we interact with the kernel, and that we declare a module by providing a module name, module data (which consists of the official name and the event handler function), sub, and order. We also agreed on the not-completely-true assumption that every kernel module should have an event handler function that deals with event types such as MOD_LOAD, MOD_UNLOAD, and so on. If any of these terms sound strange to you, I encourage you to go back and review FreeBSD Kernel Rootkit Design Howtos - 1 - KLD First Kernel Loadable Module.

The System Call Module

Today we're gonna talk about the system call module, which is a little bit different compared to the general module we discussed previously.

A general module, which we talked about in the last session, performs programmed actions only when certain events take place, such as when it loads, unloads, or the system shuts down.

A system call module, on the other hand, is basically the same as the general KLD module, except that it installs itself as a kernel service and then waits for requests from user space to perform programmed actions accordingly.

Such a module can be considered a kind of bridge between kernel space and user space, which lets its users send requests to the kernel and make it react accordingly.

What makes this system call module different is that, instead of printing messages every time we load or unload the module, we're gonna make it print messages every time we send a command to it.

In this session, we're gonna talk about the system call module, its structure, and its declaration routine, and finally write our first system call module along with a tiny client application to send commands to it.

The System Call Function

A system call function is a function defined in the system call module, which contains a list of actions to be taken every time it receives a system call.

It's similar to the module event handler except that we have control over what command to receive and what actions to perform.

The prototype of system call function is defined in sys/sysent.h and is shown below.

FILE:/usr/src/sys/sys/sysent.h

typedef int     sy_call_t(struct thread *, void *);

The struct thread * points to the currently running thread, which you don't have to care about at this stage. The void * points to the structure holding the system call's arguments.

Compared to the general KLD module, the system call module can receive multiple arguments instead of a limited set of pre-defined ones. So it is your responsibility to define the arguments that the system call module needs to deal with.

Since the system call's arguments are wrapped in a struct, we can define it like this:

struct sc_example_args {
    char *str;
};

By doing this, we defined a struct called sc_example_args that contains one member, char *str.

Having the system call's arguments struct successfully defined, we can now declare our system call function like this:

static int sc_example(struct thread *td, void *syscall_args) {
    struct sc_example_args *uap;
    uap = (struct sc_example_args *)syscall_args;
    printf("%s\n", uap->str);
    return(0);
}

The first line is obviously the declaration of the system call function. Note that we receive all arguments via void *syscall_args inside the function.

Let's now take a look at what happens inside the function. We first declare (it's not the precise term, but it helps) a local pointer uap of our sc_example_args structure type.

Then we cast the incoming syscall_args to match our own format, the sc_example_args structure, and let the uap pointer point to it. The simple version is that we receive the arguments through syscall_args and access them through uap in the sc_example_args format.

Now we can do whatever we want with the arguments received, such as print out the string like this: printf("%s\n", uap->str);
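The untyped-pointer pattern is plain C and can be tried outside the kernel. The sketch below uses a hypothetical handler that takes a void * like sy_call_t does (minus the thread argument) and reports the string length instead of printing; struct model_args and model_handler are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Argument bundle with the same shape as sc_example_args above. */
struct model_args {
    const char *str;
};

/*
 * Handler that receives its arguments through an untyped pointer and
 * views them through the structure type it expects.
 */
static int
model_handler(void *untyped_args, size_t *len_out)
{
    struct model_args *uap = untyped_args;   /* no explicit cast needed in C */

    assert(len_out != NULL);
    if (uap == NULL || uap->str == NULL)
        return (-1);
    *len_out = strlen(uap->str);
    return (0);
}
```

In userland this is harmless because everything lives in one address space; that is exactly the property the kernel version does not have.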

It looks like we have successfully declared a system call function, but we actually just made a huge mistake.

What I mean by mistake is that the code can still be compiled and executed, but it does things in a very bad manner. Modern operating systems segregate their memory into user space and kernel space, and code running in one is not supposed to access the other's resources directly. The way we took the string pointer from the argument structure (uap = (struct sc_example_args *)syscall_args;) and dereferenced that user space pointer directly from kernel code is unsafe and not recommended.

Here’s a quote from Designing BSD Rootkits: An Introduction to Kernel Hacking that explains a little bit about kernel space and user space.

FreeBSD segregates its virtual memory into two parts: user space and kernel space. User space is where all user-mode applications run, while kernel space is where the kernel and kernel extensions (i.e., LKMs) run. Code running in user space cannot access kernel space directly (but code running in kernel space can access user space). To access kernel space from user space, an application issues a system call.

I don't wanna frighten you off by talking too much about the kernel/user space transition, so we'll use the above code first and get back to this later when we are ready.

The sysent Structure

Still remember the general module declaration macro from the previous session? Well, system call modules need to register themselves by calling a macro as well, but we have to define a sysent structure first and then pass it to the declaration macro.

The sysent structure is similar to the moduledata that we discussed in the last session; it contains the basic information about the system call. Once we register a system call module with its sysent structure, the operating system knows where to find it and how to invoke it.

The FreeBSD system maintains a table of sysent structures, one for every system call available in the running kernel, so every system call module has to provide its sysent structure during initialization to register itself in the sysent table.

So be sure that you understand how the sysent structure differs from the sysent table before we take a look at its definition in sys/sysent.h

FILE:/usr/src/sys/sys/sysent.h

struct sysent {                 /* system call table */
        int     sy_narg;        /* number of arguments */
        sy_call_t *sy_call;     /* implementing function */
        au_event_t sy_auevent;  /* audit event associated with syscall */
        systrace_args_func_t sy_systrace_args_func;
                                /* optional argument conversion function. */
        u_int32_t sy_entry;     /* DTrace entry ID for systrace. */
        u_int32_t sy_return;    /* DTrace return ID for systrace. */
        u_int32_t sy_flags;     /* General flags for system calls. */
        u_int32_t sy_thrcnt;
};
…
extern struct sysent sysent[];

I guess the comments in the above code explain exactly what you need to know. Note that normally we just need to specify sy_narg and sy_call for this to work.
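To see how a table of such structures gets used, here is a tiny userland model of number-based dispatch. Everything prefixed with model_ is made up for this sketch, the table has only four slots, and returning ENOSYS for an empty slot is a choice of the model rather than a description of the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int model_call_t(void *args);

/* Cut-down sysent: just the two fields we normally fill in. */
struct model_sysent {
    int           narg;   /* number of arguments */
    model_call_t *call;   /* implementing function, NULL if unused */
};

/* Example service: doubles the integer it is given. */
static int
model_double(void *args)
{
    int *value = args;

    *value *= 2;
    return (0);
}

#define MODEL_NSYSENT 4
static struct model_sysent model_table[MODEL_NSYSENT] = {
    [2] = { 1, model_double },   /* slot 2 is the only one in use */
};

/* Look up the slot by number and call its implementing function. */
static int
model_dispatch(int number, void *args)
{
    assert(args != NULL);
    if (number < 0 || number >= MODEL_NSYSENT ||
        model_table[number].call == NULL)
        return (ENOSYS);
    return (model_table[number].call(args));
}
```

The number passed to model_dispatch plays the role of the offset value: it is nothing more than an index into the table.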

Now we can extend our previous example code as follows

struct sc_example_args {
    char *str;
};

static int sc_example(struct thread *td, void *syscall_args) {
    struct sc_example_args *uap;
    uap = (struct sc_example_args *)syscall_args;
    printf("%s\n", uap->str);
    return(0);
}

static struct sysent sc_example_sysent = {
    1,              /* number of arguments */
    sc_example      /* implementing function */
};

The Offset Value

Like the system call function and the sysent structure, the offset value is another parameter you need to set and pass to the system call module declaration macro. Basically, the offset value is the system call's number, which the system uses to locate its sysent structure in the sysent table.

It should be a unique integer, and it is passed explicitly to the system call's declaration macro. It is considered good practice not to assign fixed numbers to dynamic system call modules. Instead, we can ask the system to dynamically assign an unused offset number to our system call module by doing this:

static int offset = NO_SYSCALL;

NO_SYSCALL is a constant that tells the system to use the next available slot in the sysent table.

Just in case you are interested, the value of NO_SYSCALL is -1, as shown below:

FILE:/usr/src/sys/sys/sysent.h

#define NO_SYSCALL (-1)

Some of the pre-defined system call offsets are listed in the /sys/kern/syscalls.master file. Here are some of the allocations:

Offset Range | Comment
0-150 | Reserved/unimplemented system calls. For use in future Berkeley releases.
151-180 | Reserved for vendor-specific system calls
181-199 | Used by/reserved for BSD
210-219 | Reserved for loadable syscalls
220-249 | Were introduced with NetBSD/4.4Lite-2
250-299 | Initially used in OpenBSD
300-531 | Syscall numbers for FreeBSD
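The "pick the next free slot" idea behind NO_SYSCALL can be sketched in a few lines of userland C. This is only a model: it assumes free slots are searched within the 210-219 loadable range from the table above, the model_ names are made up, and the error codes are the model's own choices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NO_SYSCALL  (-1)
#define MODEL_FIRST_SLOT  210   /* loadable range taken from the table above */
#define MODEL_LAST_SLOT   219

/* Non-zero marks a slot that is already taken. */
static int model_slot_used[MODEL_LAST_SLOT - MODEL_FIRST_SLOT + 1];

/* Claim *offset, or the first free slot when *offset is MODEL_NO_SYSCALL. */
static int
model_register(int *offset)
{
    int slot;

    assert(offset != NULL);
    if (*offset == MODEL_NO_SYSCALL) {
        for (slot = MODEL_FIRST_SLOT; slot <= MODEL_LAST_SLOT; slot++)
            if (!model_slot_used[slot - MODEL_FIRST_SLOT])
                break;
        if (slot > MODEL_LAST_SLOT)
            return (ENFILE);            /* no free slot left */
        *offset = slot;                 /* report the chosen slot back */
    } else if (*offset < MODEL_FIRST_SLOT || *offset > MODEL_LAST_SLOT) {
        return (EINVAL);                /* outside the loadable range */
    } else if (model_slot_used[*offset - MODEL_FIRST_SLOT]) {
        return (EEXIST);                /* requested slot already taken */
    }
    model_slot_used[*offset - MODEL_FIRST_SLOT] = 1;
    return (0);
}
```

This also shows why the offset is passed by pointer: the caller starts with -1 and gets the assigned number written back, which is how a load message can print it.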

The SYSCALL_MODULE Macro

I said at the beginning of this tutorial that we need to call a macro to declare a system call module, but we needed to know a few other things first. We talked about the system call function, the sysent structure, and the offset value.

The thing is, we need these declared first, and only then can we call the SYSCALL_MODULE macro. It is defined in sys/sysent.h as follows

FILE:/usr/src/sys/sys/sysent.h

#define SYSCALL_MODULE(name, offset, new_sysent, evh, arg)

Unlike the DECLARE_MODULE macro, which takes four parameters (name, data, sub, and order), SYSCALL_MODULE takes five:

name

This specifies the generic module name. It is written as a bare identifier, which the macro turns into the module's name string.

offset

This specifies the system call’s offset value, which is passed as an integer pointer.

new_sysent

This specifies the completed sysent structure, which is passed as a struct sysent pointer.

evh

This specifies the event handler function.

arg

This specifies the arguments to be passed to the event handler function. For our purposes, we’ll always set this parameter to NULL.

Great, we can now extend our previous example code as follows:

struct sc_example_args {
    char *str;
};

static int sc_example(struct thread *td, void *syscall_args) {
    struct sc_example_args *uap;
    uap = (struct sc_example_args *)syscall_args;
    printf("%s\n", uap->str);
    return(0);
}

static struct sysent sc_example_sysent = {
    1,              /* number of arguments */
    sc_example      /* implementing function */
};

static int offset = NO_SYSCALL;

SYSCALL_MODULE(sc_example, &offset, &sc_example_sysent, evh, NULL);

Don't worry about the event handler function evh; it is exactly the same as a general module's event handler function. You'll see it in the complete example soon. For now, let's sum things up.

The Summary

  • A system call module is another type of loadable kernel module

  • It installs itself in kernel space and performs programmed actions in response to system calls issued from user space

  • In order to declare a system call module, five parameters are required: name, offset, new_sysent, evh, and arg

    • name is the generic module name

    • offset is the system call module’s offset number

      • It determines where the system call module's sysent structure is stored in the sysent[] table

    • new_sysent is a pointer to the system call module’s sysent structure

      • The sysent structure is similar to moduledata

      • It contains basic information about the system call module, including

        • The number of arguments it expects

        • The implementing function

      • The implementing function is also known as the system call function

      • It contains the actions to be taken every time the system call is issued

The Example

We can now write our first system call module. Take a look at the following code; the comments should help you understand it. There is absolutely nothing new except the event handler function, which in fact isn't new to us either.

/*
 * FILE: /root/rootkit/2.1/sc_example.c
 * Example 2.1
 * The First System Call Module
 * FreeBSD Rootkit Design Howtos @ www.hailang.me
*/
#include <sys/types.h>
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/module.h>
#include <sys/sysent.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/sysproto.h>

/* The system call's arguments. */
struct sc_example_args {
    char *str;
};

/* The system call function. */
static int sc_example(struct thread *td, void *syscall_args)
{
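    struct sc_example_args *uap;
    uap = (struct sc_example_args *)syscall_args;

    /*
     * Note: str is a user-space pointer and is printed directly to keep
     * the example short; copyinstr(9) is the safe way to fetch it.
     */
    printf("%s\n", uap->str);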

    return (0);
}

/* The sysent for the new system call */
static struct sysent sc_example_sysent = {
    1,          /* number of arguments */
    sc_example  /* implementing function */
};

/* The offset in sysent[] where the system call is to be allocated. */
static int offset = NO_SYSCALL;

/* The function called at load/unload. */
static int load(struct module *module, int cmd, void *arg)
{
    int error = 0;

    switch(cmd) {
    case MOD_LOAD:
        uprintf("System call loaded at offset %d.\n", offset);
        break;
    case MOD_UNLOAD:
        uprintf("System call unloaded from offset %d.\n", offset);
        break;
    default:
        error = EOPNOTSUPP;
        break;
    }

    return(error);
}

/* Declare the System Call Module */
SYSCALL_MODULE(sc_example, &offset, &sc_example_sysent, load, NULL);

As usual, we need to have a makefile in the same directory as the source code file.

KMOD=   sc_example
SRCS=   sc_example.c

.include <bsd.kmod.mk>

Now build your first system call module by using the make command in the same directory.

myBSD# make
Warning: Object directory not changed from original /root/rootkit/2.1
@ -> /usr/src/sys
machine -> /usr/src/sys/amd64/include
x86 -> /usr/src/sys/x86/include
cc -O2 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc   -I. -I@ -I@/contrib/altq -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common  -fno-omit-frame-pointer  -mno-sse -mcmodel=kernel -mno-red-zone -mno-mmx -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option   -c sc_example.c
ld  -d -warn-common -r -d -o sc_example.ko sc_example.o
:> export_syms
awk -f /sys/conf/kmod_syms.awk sc_example.ko  export_syms | xargs -J% objcopy % sc_example.ko
objcopy --strip-debug sc_example.ko

myBSD# ls
@       export_syms machine     makefile    sc_example.c    sc_example.ko   sc_example.o    x86

We have successfully compiled our first system call module and we got the sc_example.ko file!

The Loading and the Calling

Here are the final steps we need to take to make use of our first system call module: the loading and the calling. Let’s first load the module into the running kernel, and then figure out how to issue the system call.

myBSD# kldload ./sc_example.ko
System call loaded at offset 210.

Thanks to our event handler, the module prints the system call number, which is the offset of the system call module's sysent structure in the sysent[] table. You'll soon see why we need this number to issue the system call.

Now we have two ways to send a command to our system call module: we can either write a user space application or type a simple command. I will talk about the command first, since the user space application will be covered in the next tutorial.

myBSD# perl -e '$str = "Hello kernel!\n I am here to dance with you!";' -e 'syscall(210, $str);'
myBSD# dmesg | tail -n 2
Hello kernel!
 I am here to dance with you!

Note that we explicitly specified the system call number in the perl command so that our string reaches our system call module.

That's all for today. We made an upgraded version of our fun kernel printing toolkit, which can now print whatever string you want it to. Play with it and try to digest all the new terminology.

We'll talk about how to call a system call module without knowing its offset value in the next tutorial. See you there.

Largest Product In A Series

Originally posted @ 04/08/2013


Find the greatest product of five consecutive digits in the 1000-digit number.

x = """73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450
"""
x = x.replace('\n', '')

This is the 8th problem from Project Euler. Given a 1000-digit number, we need to find the maximum product of 5 consecutive digits.

Five consecutive digits means (7, 3, 1, 6, 7), (3, 1, 6, 7, 1), and so on. The game plan is to write a simple loop, walk through the 1000-digit number, and see which group of 5 yields the greatest product.

Note that the last group of 5 consecutive digits is (6, 3, 4, 5, 0), which starts at index len(x) - 5. That is why the loop below runs over range(len(x) - 4).

x = """73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450
"""
x = x.replace('\n', '')

if __name__ == '__main__':
    max_product = -1
    for i in range(len(x) - 4):
        product = int(x[i]) * int(x[i+1]) * int(x[i+2]) * \
                int(x[i+3]) * int(x[i+4])
        if product > max_product:
            max_product = product
    print(max_product)

Very straightforward code, taken from pastebin and modified a little bit. Of course there are simpler and more pythonic ways, but since this is a simple question, I'll just be lazy and ignore the possible ugliness of this code.

The answer is 40824. If you print out the five digits, they are (9, 9, 8, 7, 9), starting at index 364.
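For anyone who wants to check the index as well as the product, here is a slightly more general variant of the loop above. It is only a sketch: the function name max_window_product and its width parameter are made up for this example and are not part of the original solution. To keep it short it runs on a single row of the number, the 8th one, which happens to contain the winning digits.

```python
def max_window_product(digits, width=5):
    """Return (best_product, start_index) over every run of `width` digits."""
    best_product, best_start = -1, -1
    # Valid start indices run from 0 to len(digits) - width inclusive.
    for start in range(len(digits) - width + 1):
        product = 1
        for ch in digits[start:start + width]:
            product *= int(ch)
        # Strict comparison keeps the first window when there is a tie.
        if product > best_product:
            best_product, best_start = product, start
    return best_product, best_start


# The 8th row of the 1000-digit number; it begins at index 350 of x.
row = "70172427121883998797908792274921901699720888093776"
print(max_window_product(row))   # (40824, 14)
```

Index 14 within that row is index 350 + 14 = 364 in the full number, which matches the result above. Calling max_window_product(x) on the full string works the same way.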

<< EOF