1.环境介绍
- Linux内核版本 3.10.65 + ARM64 + SMP PREEMPT + aarch64
- 全志A133
- Source insight
- Vim
2.printk整体流程简介
printk是Linux内核向控制台输出log显示的函数,Linux内核先分配一个静态的临时缓冲区log_buf,然后调用vscnprintf格式化显示字符串,并返回格式化之后字符串的长度,最后调用tty驱动初始化注册绑定好的console.write()实现向串口输出。
3.printk定义
1 | asmlinkage int printk(const char *fmt, ...) |
4.printk调用流程分析
5.printk log如何输出到具体console?具体串口硬件设备注册流程分析
现以全志A133平台叙述:芯片平台一般有好几个串口,具体选择哪个串口作为printk输出串口是可以配置的。我是通过cmdline配置console=ttyS0,115200,然后传递给内核,内核在启动时会解析cmdline,根据cmdline的解析结果来选择使用哪个串口,并进行串口注册。
串口(UART)的注册,即硬件串口的准备。代码路径:
linux-3.10/drivers/tty/serial/sunxi-uart.c首先从设备树里加载UART的硬件资源,然后在uart模块init的时候,进行uart_register_driver()或者是platform_driver_register()。同时会调用xxx_console_setup()初始化串口模式,主要是初始化
struct console,sw_console_write()函数在con->write()的时候调用,就完成了往串口数据的打印。1
2
3
4
5
6
7
8
9
10
11static struct uart_driver sw_uart_driver;
static struct console sw_console = {
.name = "ttyS",
.write = sw_console_write,
.device = uart_console_device,
.setup = sw_console_setup,
.flags = CON_PRINTBUFFER,
//.flags = CON_BOOT,
.index = -1,
.data = &sw_uart_driver,
};printk.c里的register_console流程分析。代码路径:
linux-3.10/kernel/printk.c1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153void register_console(struct console *newcon)
{
int i;
unsigned long flags;
struct console *bcon = NULL;
/*
* before we register a new CON_BOOT console, make sure we don't
* already have a valid console
*/
if (console_drivers && newcon->flags & CON_BOOT) { //判断是否是引导控制台CON_BOOT
/* find the last or real console */
for_each_console(bcon) {
if (!(bcon->flags & CON_BOOT)) {
printk(KERN_INFO "Too late to register bootconsole %s%d\n",
newcon->name, newcon->index);
return;
}
}
}
if (console_drivers && console_drivers->flags & CON_BOOT)
bcon = console_drivers;
if (preferred_console < 0 || bcon || !console_drivers)
preferred_console = selected_console;
if (newcon->early_setup)
newcon->early_setup();
/*
* See if we want to use this console driver. If we
* didn't select a console we take the first one
* that registers here.
*/
if (preferred_console < 0) {
if (newcon->index < 0)
newcon->index = 0;
if (newcon->setup == NULL ||
newcon->setup(newcon, NULL) == 0) {
newcon->flags |= CON_ENABLED;
if (newcon->device) {
newcon->flags |= CON_CONSDEV;
preferred_console = 0;
}
}
}
/*
* See if this console matches one we selected on
* the command line.
*/
for (i = 0; i < MAX_CMDLINECONSOLES && console_cmdline[i].name[0];
i++) {
if (strcmp(console_cmdline[i].name, newcon->name) != 0)
continue;
if (newcon->index >= 0 &&
newcon->index != console_cmdline[i].index)
continue;
if (newcon->index < 0)
newcon->index = console_cmdline[i].index;
if (console_cmdline[i].brl_options) {
newcon->flags |= CON_BRL;
braille_register_console(newcon,
console_cmdline[i].index,
console_cmdline[i].options,
console_cmdline[i].brl_options);
return;
}
if (newcon->setup &&
newcon->setup(newcon, console_cmdline[i].options) != 0)//调用到sunxi driver里的setup
break;
newcon->flags |= CON_ENABLED;
newcon->index = console_cmdline[i].index;
if (i == selected_console) {
newcon->flags |= CON_CONSDEV;
preferred_console = selected_console;
}
break;
}
if (!(newcon->flags & CON_ENABLED)) //不是console的driver直接返回
return;
/*
* If we have a bootconsole, and are switching to a real console,
* don't print everything out again, since when the boot console, and
* the real console are the same physical device, it's annoying to
* see the beginning boot messages twice
*/
if (bcon && ((newcon->flags & (CON_CONSDEV | CON_BOOT)) == CON_CONSDEV))
newcon->flags &= ~CON_PRINTBUFFER;
/*
* Put this console in the list - keep the
* preferred driver at the head of the list.
*/
console_lock();
if ((newcon->flags & CON_CONSDEV) || console_drivers == NULL) {
newcon->next = console_drivers;
console_drivers = newcon;
if (newcon->next)
newcon->next->flags &= ~CON_CONSDEV;
} else {
newcon->next = console_drivers->next;
console_drivers->next = newcon;
}
if (newcon->flags & CON_PRINTBUFFER) {
/*
* console_unlock(); will print out the buffered messages
* for us.
*/
raw_spin_lock_irqsave(&logbuf_lock, flags);
console_seq = syslog_seq;
console_idx = syslog_idx;
console_prev = syslog_prev;
raw_spin_unlock_irqrestore(&logbuf_lock, flags);
/*
* We're about to replay the log buffer. Only do this to the
* just-registered console to avoid excessive message spam to
* the already-registered consoles.
*/
exclusive_console = newcon;
}
console_unlock();
console_sysfs_notify();
/*
* By unregistering the bootconsoles after we enable the real console
* we get the "console xxx enabled" message on all the consoles -
* boot consoles, real consoles, etc - this is to ensure that end
* users know there might be something in the kernel's log buffer that
* went to the bootconsole (that they do not see on the real console)
*/
if (bcon &&
((newcon->flags & (CON_CONSDEV | CON_BOOT)) == CON_CONSDEV) &&
!keep_bootcon) {
/* we need to iterate through twice, to make sure we print
* everything out, before we unregister the console(s)
*/
printk(KERN_INFO "console [%s%d] enabled, bootconsole disabled\n",
newcon->name, newcon->index);
for_each_console(bcon)
if (bcon->flags & CON_BOOT)
unregister_console(bcon);
} else {
printk(KERN_INFO "%sconsole [%s%d] enabled\n",
(newcon->flags & CON_BOOT) ? "boot" : "" ,
newcon->name, newcon->index);
}
}register_console()首先遍历全局的console_drivers(**决定哪个driver输出printk log)*,然后根据早期硬件串口驱动注册时的struct console*的成员flags是否为
CON_BOOT(引导控制台)。接下来获取preferred_console,从preferred_console = selected_console可以获取到串口号。然后接下来使用for循环匹配cmdline里的console name与注册时的newcon->name字段,最后把串口号放到console_drivers列表里,截止现在就完成了tty端口的指定了。1
2struct console *console_drivers;
EXPORT_SYMBOL_GPL(console_drivers);
6.prink注意点
printk 分2个阶段, 第一个阶段是将log放到buff, 第二阶段是将buff数据送到uart。Dump log到uart console是一个同步的操作过程。在此期间,IRQ是被disabled,CPU也是在wait,一直到dump完成。由于此阶段的速度会很慢, 并且会把local的irq disable. 所以会导致local cpu sched, 如果在此cpu 的thread都会被block,严重的可能导致系统block。接下来分析一下printk一些可能与异常有关的函数
zap_locks
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16static void zap_locks(void)
{
static unsigned long oops_timestamp;
if (time_after_eq(jiffies, oops_timestamp) &&
!time_after(jiffies, oops_timestamp + 30 * HZ))
return;
oops_timestamp = jiffies;
debug_locks_off();
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(&logbuf_lock);
/* And make sure that we print immediately */
sema_init(&console_sem, 1);
}zap_locks()函数主要作用就是在prink运行期间,如果printk local CPU发生了crash,确保不能死锁,并且每隔10s初始化锁
logbuf_lock和console_sem,留时间给控制台打印完全的oops信息。但我在实际压力测试驱动过程发现,机器一直异常重启(必现),并且很多时候kernel reboot异常的堆栈信息不见了,具体异常情况稍后介绍和分析。console_trylock_for_printk
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27static int console_trylock_for_printk(unsigned int cpu)
__releases(&logbuf_lock)
{
int retval = 0, wake = 0;
if (console_trylock()) {
retval = 1;
/*
* If we can't use the console, we need to release
* the console semaphore by hand to avoid flushing
* the buffer. We need to hold the console semaphore
* in order to do this test safely.
*/
if (!can_use_console(cpu)) {
console_locked = 0;
wake = 1;
retval = 0;
}
}
logbuf_cpu = UINT_MAX;
raw_spin_unlock(&logbuf_lock);
if (wake)
up(&console_sem);
return retval;
}要想对console操作,首先需要console_trylock()获取锁,需要注意的是不管是否成功获取console_sem,都需要保证logbuf_lock的锁(spin lock)释放掉,需要注意的是此时仍然禁止中断!!!
7.printk异常情景
最近整合一个工程,但压力测试时发现机器总是莫名其妙重启(接口调用非常频繁,并且printk的优先级都是*KERN_ERR*),并且偶尔能看到异常重启时的堆栈信息,大多时候没有异常堆栈信息,即使有,log也是残缺的。从偶现的log关键字:WARNING: at kernel/watchdog.c:283 watchdog_timer_fn+0xf8/0x2f4(),Watchdog detected hard LOCKUP on cpu 1,[sched_delayed] sched: RT throttling activated,分析可能是由于watchdog检测到了CPU上长时间关闭中断(机制就是watchdog会检查hrtimer值,如果在watchdog_thresh时间内hrtimer值没有递增了,那就认为此时发生了死锁,具体是硬件死锁还是软件死锁,后续单开一篇文章来写),然后就由watchdog接管,看此时是重启还是其他操作(可配置),并且在kernel也可以配置BOOTPARAM_HARDLOCKUP_PANIC_VALUE决定发生hard lockup时是否panic,可很奇怪的是我确认BOOTPARAM_HARDLOCKUP_PANIC_VALUE配置为1,但实际发生hard lockup时仍然没有异常堆栈信息,这点有大神可以指点一下感激不尽。
8.printk使用建议
printk在生产环境中尽量少用,并且在做性能测试的时候少用printk测试,尽量在程序出现不符合预期的行为时打印log,在程序运行缓慢或者block的时候可以检查一下代码,是否有频繁调用的接口里大量使用printk。