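I want to read Linux kernel statistics for a single thread using netlink sockets and taskstats.
I can get taskstats working with a Python wrapper (https://github.com/facebook/gnlpy), but I want a C implementation.
After setting up the socket and the message parameters and sending the message, the receive call nl_recvmsgs_default(sock) always returns error code -7 ("Invalid input data or parameter") or -12 ("Object not found"), depending on how I create the message to send.
I checked every call made before nl_recvmsgs_default(sock) and none of them reports an error. I assume I am missing a step when setting up the message or the socket, but I don't know what it is.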
#include <stdlib.h>
#include <unistd.h>
#include <linux/taskstats.h>
#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>

int callback_message(struct nl_msg *, void *);

int main(int argc, char ** argv) {
    struct nl_sock * sock;
    struct nl_msg * msg;
    int family;

    sock = nl_socket_alloc();
    // Connect to generic netlink socket on kernel side
    genl_connect(sock);
    // get the id for the TASKSTATS generic family
    family = genl_ctrl_resolve(sock, "TASKSTATS");
    // Allocate a new netlink message and inherit netlink message header.
    msg = nlmsg_alloc();
    genlmsg_put(msg, NL_AUTO_PID, NL_AUTO_SEQ, family, 0, 0, TASKSTATS_CMD_GET, TASKSTATS_VERSION))
    //error code: -7 NLE_INVAL "Invalid input data or parameter",
    nla_put_string(msg, TASKSTATS_CMD_ATTR_REGISTER_CPUMASK, "0");
    //error code: -12 NLE_OBJ_NOTFOUND "Obj not found"
    //nla_put_string(msg, TASKSTATS_CMD_ATTR_PID, "583");
    nl_send_auto(sock, msg);
    nlmsg_free(msg);
    // specify a callback for inbound messages
    nl_socket_modify_cb(sock, NL_CB_MSG_IN, NL_CB_CUSTOM, callback_message, NULL);
    // gives error code -7 or -12 depending on the two nla_put_string alternatives above
    printf("recv code (0 = success): %d", nl_recvmsgs_default(sock));
}

int callback_message(struct nl_msg * nlmsg, void * arg) {
    struct nlmsghdr * nlhdr;
    struct nlattr * nlattrs[TASKSTATS_TYPE_MAX + 1];
    struct nlattr * nlattr;
    struct taskstats * stats;
    int rem;
    nlhdr = nlmsg_hdr(nlmsg);
    int answer;

    if ((answer = genlmsg_parse(nlhdr, 0, nlattrs, TASKSTATS_TYPE_MAX, NULL)) < 0) {
        printf("error parsing msg\n");
    }
    if ((nlattr = nlattrs[TASKSTATS_TYPE_AGGR_TGID]) ||
        (nlattr = nlattrs[TASKSTATS_TYPE_AGGR_PID]) ||
        (nlattr = nlattrs[TASKSTATS_TYPE_NULL])) {
        stats = nla_data(nla_next(nla_data(nlattr), &rem));
        printf("---\n");
        printf("pid: %u\n", stats->ac_pid);
        printf("command: %s\n", stats->ac_comm);
        printf("status: %u\n", stats->ac_exitcode);
        printf("time:\n");
        printf(" start: %u\n", stats->ac_btime);
        printf(" elapsed: %llu\n", stats->ac_etime);
        printf(" user: %llu\n", stats->ac_utime);
        printf(" system: %llu\n", stats->ac_stime);
        printf("memory:\n");
        printf(" bytetime:\n");
        printf(" rss: %llu\n", stats->coremem);
        printf(" vsz: %llu\n", stats->virtmem);
        printf(" peak:\n");
        printf(" rss: %llu\n", stats->hiwater_rss);
        printf(" vsz: %llu\n", stats->hiwater_vm);
        printf("io:\n");
        printf(" bytes:\n");
        printf(" read: %llu\n", stats->read_char);
        printf(" write: %llu\n", stats->write_char);
        printf(" syscalls:\n");
        printf(" read: %llu\n", stats->read_syscalls);
        printf(" write: %llu\n", stats->write_syscalls);
    } else {
        printf("unknown attribute format received\n");
    }
    return 0;
}
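The code you posted works fine for me, apart from the syntax error on line 26 (the genlmsg_put call has an extra closing parenthesis and no semicolon). Make sure you run the program as root. Note that you are registering a listener for exiting tasks but reading only one message, which, as far as I can tell, is an ACK. If you read from the socket in a while(1) loop, a parsed message is printed whenever a task exits on CPU 0.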
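To make the ACK observation concrete, here is a minimal sketch of how a single received message could be told apart from a data message at the raw netlink level. It uses only the kernel's <linux/netlink.h> definitions; the enum nl_kind and the helper classify_nlmsg are hypothetical names introduced for illustration and are not part of libnl or of the original program.

```c
#include <errno.h>
#include <string.h>
#include <linux/netlink.h>

/* Possible outcomes for one received message (illustrative names). */
enum nl_kind { NL_KIND_ACK, NL_KIND_ERROR, NL_KIND_DATA };

/*
 * Classify one netlink message by its header.
 * An NLMSG_ERROR message whose error field is 0 is an ACK; a non-zero
 * error field carries a negative errno value. Any other message type is
 * treated as a data message. If err_out is non-NULL, it receives the
 * error field (0 for ACKs and data messages).
 */
static enum nl_kind classify_nlmsg(const struct nlmsghdr *nlh, int *err_out)
{
    enum nl_kind kind = NL_KIND_DATA;
    int err = 0;

    if (nlh->nlmsg_type == NLMSG_ERROR) {
        if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr))) {
            /* Too short to hold a struct nlmsgerr: report it as malformed. */
            kind = NL_KIND_ERROR;
            err = -EBADMSG;
        } else {
            struct nlmsgerr e;
            /* The payload starts right after the aligned message header. */
            memcpy(&e, (const char *)nlh + NLMSG_HDRLEN, sizeof(e));
            err = e.error;
            kind = (err == 0) ? NL_KIND_ACK : NL_KIND_ERROR;
        }
    }
    if (err_out)
        *err_out = err;
    return kind;
}
```

In the program above, such a check could presumably be applied to the header returned by nlmsg_hdr(nlmsg) at the top of callback_message, so that ACKs are skipped before any attempt to parse taskstats attributes.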
Edit: if you want the statistics for a single PID, you should use nla_put_u32 instead:
    nla_put_u32(msg, TASKSTATS_CMD_ATTR_PID, 583);
where 583 is the ID of an existing process.
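A short sketch of why the string variant misbehaves, assuming the kernel reads the TASKSTATS_CMD_ATTR_PID payload as a native-endian 32-bit integer: nla_put_string stores the characters '5', '8', '3' plus the terminating NUL, which is exactly four bytes, so the attribute has a plausible length but decodes to a different number. The helper payload_as_u32 is a hypothetical name used only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Interpret the first four bytes of an attribute payload as a
 * native-endian 32-bit unsigned integer, the way a receiver that
 * expects a u32 attribute would. The caller must supply at least
 * four readable bytes.
 */
static uint32_t payload_as_u32(const void *payload)
{
    uint32_t value;

    memcpy(&value, payload, sizeof(value));
    return value;
}
```

With this helper, payload_as_u32("583") does not yield 583; on a little-endian machine it yields 3356725 (0x00333835). A PID that large most likely does not exist, which would be consistent with the "Object not found" error in the question. Passing the address of a uint32_t holding 583 returns 583, which corresponds to what nla_put_u32 puts into the message.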