Linux Interrupt Management (1): Linux Interrupt Management Mechanism

Date: 2023-03-09 00:42:33
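Contents:

Linux Interrupt Management

Linux Interrupt Management (1): Linux Interrupt Management Mechanism

Linux Interrupt Management (2): Softirqs and Tasklets

Linux Interrupt Management (3): Workqueues

Keywords: GIC, IAR, EOI, SGI/PPI/SPI, interrupt mapping, interrupt exception vectors, interrupt context, kernel interrupt threads, interrupt registration.

Because this article is long, here is a brief outline.

The chapter falls into three main parts:

Hardware background: 1. ARM Interrupt Controller.

The static process at system initialization: GIC initialization and the mapping of each interrupt number (2. Mapping Hardware Interrupt Numbers to Linux Interrupt Numbers), and the registration of each interrupt (5. Registering Interrupts).

The dynamic path of one interrupt from generation to completion: how the generic ARM low-level code handles it (3. ARM Low-Level Interrupt Handling), followed by the GIC-specific flow and the generic upper-layer handling (4. High-Level Interrupt Handling).

The high-level handling described here does not include bottom halves. Those are covered in Linux Interrupt Management (2): Softirqs and Tasklets and Linux Interrupt Management (3): Workqueues.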

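1. ARM Interrupt Controller

1.1 Interrupt Types Supported by ARM

ARM GIC-v2 supports three types of interrupts:

SGI: Software Generated Interrupt, typically used for communication between cores. Up to 16 SGIs are supported, with hardware interrupt IDs ID0~ID15. The Linux kernel usually uses SGIs as IPIs (inter-processor interrupts), which are delivered to the CPUs specified by the system.

PPI: Private Peripheral Interrupt, private to each CPU. Up to 16 PPIs are supported, with hardware interrupt IDs ID16~ID31. A PPI is delivered to its own CPU; a typical use is the CPU-local timer.

SPI: Shared Peripheral Interrupt. Up to 988 peripheral interrupts are supported, with hardware interrupt IDs ID32~ID1019.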
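The three ID ranges can be captured in a few lines of C. This is only an illustrative sketch: `gic_classify` and the enum names are hypothetical helpers, not kernel APIs.

```c
#include <stdint.h>

/* GIC-v2 hardware interrupt classes, following the ID ranges listed above. */
enum gic_irq_class { GIC_SGI, GIC_PPI, GIC_SPI, GIC_INVALID };

/* Classify a hardware interrupt ID (hypothetical helper, not a kernel API). */
static enum gic_irq_class gic_classify(uint32_t hwirq)
{
    if (hwirq <= 15)
        return GIC_SGI;     /* ID0~ID15 */
    if (hwirq <= 31)
        return GIC_PPI;     /* ID16~ID31 */
    if (hwirq <= 1019)
        return GIC_SPI;     /* ID32~ID1019 */
    return GIC_INVALID;     /* outside the ranges listed above */
}
```

For example, `gic_classify(29)` yields `GIC_PPI` and `gic_classify(37)` yields `GIC_SPI`.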

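1.2 GIC Interrupt Detection Flow

The GIC consists of two main parts: the arbitration unit (Distributor) and the CPU Interface module.

The Distributor maintains a state machine for every interrupt, with the states inactive, pending, active and pending, and active.

The figure below is a screenshot from section 3.2.4, "Interrupt handling state machine", of the GIC-v2 specification (IHI0048B):

[Figure: GIC interrupt handling state machine]

The GIC detects and handles an interrupt as follows:

(1) When the GIC detects that an interrupt has occurred, it marks the interrupt as pending (A1).

(2) For an interrupt in the pending state, the Distributor determines the target CPU and sends the interrupt request to that CPU.

(3) For each CPU, the Distributor selects the highest-priority interrupt among all pending interrupts and sends it to the CPU Interface module of the target CPU.

(4) The CPU Interface decides whether this interrupt can be forwarded to the CPU. If the interrupt's priority is sufficient, the GIC asserts an interrupt signal to that CPU.

(5) After entering the interrupt exception, the CPU reads the GICC_IAR register to acknowledge the interrupt (normally the Linux kernel's interrupt handler performs this read). The register returns the hardware interrupt ID; for an SGI it also returns the ID of the source CPU.

Once the GIC sees that software has read this register, one of the following happens:

* If the interrupt is pending, its state changes to active. (C)

* If the interrupt is asserted again, the pending state changes to active and pending. (D)

* If the interrupt is active, it now becomes active and pending. (A2)

(6) When the processor finishes servicing the interrupt, it must send an EOI (End Of Interrupt) completion signal to the GIC. Software writes the GICC_EOIR register and the state becomes inactive. (E1)

Additional note:

(7) For a level-triggered interrupt, when the triggering level is removed, the state changes from active and pending to active. (B2)

The common path is A1->D->B2->E1.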
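The transitions named in the steps above can be modeled as a small state machine. This is a deliberately simplified sketch that covers only the transitions mentioned in the text (A1, A2, B2, C, D, E1); the type and function names are made up for illustration and the real GIC state machine has more edges.

```c
/* Simplified model of the per-interrupt state machine described above.
 * Only the transitions named in the text are modeled. */
enum gic_state { INACTIVE, PENDING, ACTIVE, ACTIVE_AND_PENDING };
enum gic_event { EV_A1, EV_A2, EV_B2, EV_C, EV_D, EV_E1 };

/* Apply one transition; an event that does not apply leaves the state as is. */
static enum gic_state gic_step(enum gic_state s, enum gic_event ev)
{
    switch (ev) {
    case EV_A1: return s == INACTIVE ? PENDING : s;            /* interrupt detected */
    case EV_C:  return s == PENDING ? ACTIVE : s;              /* acknowledged via GICC_IAR */
    case EV_D:  return s == PENDING ? ACTIVE_AND_PENDING : s;  /* acknowledged, still asserted */
    case EV_A2: return s == ACTIVE ? ACTIVE_AND_PENDING : s;   /* asserted again while active */
    case EV_B2: return s == ACTIVE_AND_PENDING ? ACTIVE : s;   /* level removed */
    case EV_E1: return s == ACTIVE ? INACTIVE : s;             /* EOI written */
    }
    return s;
}

/* Run a sequence of events starting from state s. */
static enum gic_state gic_run(enum gic_state s, const enum gic_event *evs, int n)
{
    for (int i = 0; i < n; i++)
        s = gic_step(s, evs[i]);
    return s;
}
```

Feeding the common path A1, D, B2, E1 into `gic_run` starting from `INACTIVE` ends in `INACTIVE` again, passing through pending, active and pending, and active.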

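1.2.1 GIC Interrupt Preemption

The GIC supports priority-based preemption: a higher-priority interrupt can preempt a lower-priority interrupt that is in the active state. The Distributor tracks and compares the highest-priority pending interrupt, preempts the current interrupt, and sends the highest-priority request to the CPU. The CPU acknowledges the higher-priority interrupt, suspends servicing of the lower-priority one, and handles the higher-priority one first.

The GIC always sends the highest-priority pending interrupt request to the CPU.

1.2.2 How Linux Handles Interrupt Preemption

From the GIC's point of view, it sends the higher-priority interrupt request to the CPU.

However, the CPU is running with interrupts disabled at that point, so the request has to wait until the lower-priority interrupt has been handled and the EOI has been sent to the GIC.

Only then does the CPU acknowledge and handle the highest-priority interrupt among those pending.

So under Linux:

1. A higher-priority interrupt cannot preempt a lower-priority interrupt that is currently being handled.

2. Among interrupts that are pending at the same time, the higher-priority one is acknowledged and handled first.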

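1.3 GIC Interrupt Timing

[Figure: GIC-400 Figure B-2, Signaling physical interrupts]

Figure B-2, "Signaling physical interrupts", from the GIC-400 documentation helps explain how the GIC works internally.

M and N are both SPI peripheral interrupts handled through FIQ, both are triggered by a high level, N has a higher priority than M, and both target the same CPU.

(1) T1: the GIC Distributor detects the level change of interrupt M.

(2) T2: the Distributor sets the state of interrupt M to pending.

(3) T17: the CPU Interface module drives nFIQCPU[n] low. After interrupt M becomes pending, it takes roughly 15 clock cycles before nFIQCPU[n] is driven low to report the interrupt request to the CPU (assertion). The Distributor needs this time to work out which pending interrupt has the highest priority.

(4) T42: the Distributor detects another interrupt, N, with a higher priority.

(5) T43: the Distributor replaces interrupt M with interrupt N as the highest-priority pending interrupt and sets interrupt N to pending.

(6) T58: after tph clock cycles, the CPU Interface drives nFIQCPU[n] low to notify the CPU. Because this signal has already been low since T17, the CPU Interface module only updates the Interrupt ID field of the GICC_IAR register, which now holds the hardware interrupt ID of interrupt N.

(7) T61~T131: Linux services interrupt N. This is the interrupt service segment, which starts with the read of GICC_IAR and ends with the write to GICC_EOIR.

  T61: the CPU (the Linux interrupt service routine) reads GICC_IAR, which means software has acknowledged interrupt N. The Distributor changes interrupt N from pending to active and pending. (Read of GICC_IAR.)

  T64: within 3 clock cycles of Linux acknowledging interrupt N, the CPU Interface module deasserts nFIQCPU[n], that is, drives it high.

  T126: the peripheral also deasserts interrupt N.

  T128: the Distributor removes the pending state from interrupt N.

  T131: the Linux service routine writes the hardware ID of interrupt N to GICC_EOIR, completing the handling of interrupt N. (Write to GICC_EOIR.)

(8) T146: tph clock cycles after the ID of interrupt N is written to GICC_EOIR, the Distributor selects the next highest-priority interrupt, M, and sends the request to the CPU Interface module. The CPU Interface drives nFIQCPU[n] low to report peripheral M's interrupt request to the CPU.

(9) T211: the Linux interrupt service routine reads GICC_IAR to acknowledge the interrupt, and the Distributor sets interrupt M to active and pending.

(10) T214: within 3 clock cycles of the CPU acknowledging the interrupt, the CPU Interface module drives nFIQCPU[n] high to complete the deassert.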
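The service segment in the timeline is bracketed by a read of GICC_IAR and a write to GICC_EOIR. As a rough sketch of what software gets back from that read, the following decodes a raw GICC_IAR value, assuming the GIC-v2 layout in which bits [9:0] hold the interrupt ID and bits [12:10] hold the source CPU ID for SGIs. The macro and function names are made up for this example.

```c
#include <stdint.h>

/* Field layout assumed for GICC_IAR on GIC-v2; names are illustrative only. */
#define SKETCH_IAR_INTID_MASK   0x3ffu  /* bits [9:0]: interrupt ID */
#define SKETCH_IAR_CPUID_SHIFT  10      /* bits [12:10]: source CPU, SGIs only */
#define SKETCH_IAR_CPUID_MASK   0x7u

struct gic_ack {
    uint32_t intid;    /* hardware interrupt ID */
    uint32_t src_cpu;  /* source CPU ID, meaningful for SGIs */
};

/* Split a raw GICC_IAR value into its two fields. */
static struct gic_ack gic_decode_iar(uint32_t iar)
{
    struct gic_ack ack;

    ack.intid = iar & SKETCH_IAR_INTID_MASK;
    ack.src_cpu = (iar >> SKETCH_IAR_CPUID_SHIFT) & SKETCH_IAR_CPUID_MASK;
    return ack;
}
```

A handler built on this would read the register once, decode it, dispatch on `intid`, and finish by writing to GICC_EOIR.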

So where in Linux are GICC_IAR and GICC_EOIR actually accessed?

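1.4 Cortex-A15/A7 Example

2. Mapping Hardware Interrupt Numbers to Linux Interrupt Numbers

2.1 Hardware Interrupt Numbers: A Serial Port Interrupt Example

2.2 Interrupt Controller Initialization

In the DTS, the GIC is defined in arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts: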

    gic: interrupt-controller@2c001000 {
compatible = "arm,cortex-a15-gic", "arm,cortex-a9-gic"; /* the identifier of this device is "arm,cortex-a15-gic" */
#interrupt-cells = <>;
#address-cells = <>;
interrupt-controller; /* marks this device as an interrupt controller */
reg = < 0x2c001000 0x1000>,
< 0x2c002000 0x1000>,
< 0x2c004000 0x2000>,
< 0x2c006000 0x2000>;
interrupts = < 0xf04>;
};

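struct irq_domain describes one interrupt controller.

During initialization, the GIC driver parses the DTS to find out how many GIC controllers are defined, and registers one struct irq_domain for each of them.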

struct irq_domain {
    struct list_head link;                /* links this irq_domain into the global list irq_domain_list */
    const char *name;                     /* name of the interrupt controller */
    const struct irq_domain_ops *ops;     /* methods used for irq domain mapping operations */
    void *host_data;
    unsigned int flags;

    /* Optional data */
    struct device_node *of_node;          /* device node of the interrupt controller */
    struct irq_domain_chip_generic *gc;
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
    struct irq_domain *parent;
#endif

    /* reverse map data. The linear map gets appended to the irq_domain */
    irq_hw_number_t hwirq_max;            /* maximum number of interrupts this irq domain supports */
    unsigned int revmap_direct_max_irq;
    unsigned int revmap_size;             /* size of the linear map */
    struct radix_tree_root revmap_tree;   /* root of the radix tree map */
    unsigned int linear_revmap[];         /* lookup table used by the linear map */
};

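struct irq_domain_ops defines the set of methods of an irq_domain. xlate parses the hardware interrupt number and the interrupt type out of intspec: intspec[0] and intspec[1] determine the interrupt number, and intspec[2] determines the interrupt type.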

struct irq_domain_ops {
    int (*match)(struct irq_domain *d, struct device_node *node);
    int (*map)(struct irq_domain *d, unsigned int virq, irq_hw_number_t hw);
    void (*unmap)(struct irq_domain *d, unsigned int virq);
    int (*xlate)(struct irq_domain *d, struct device_node *node,
                 const u32 *intspec, unsigned int intsize,
                 unsigned long *out_hwirq, unsigned int *out_type);

#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
    /* extended V2 interfaces to support hierarchy irq_domains */
    int (*alloc)(struct irq_domain *d, unsigned int virq,
                 unsigned int nr_irqs, void *arg);
    void (*free)(struct irq_domain *d, unsigned int virq,
                 unsigned int nr_irqs);
    void (*activate)(struct irq_domain *d, struct irq_data *irq_data);
    void (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
#endif
};

static const struct irq_domain_ops gic_irq_domain_hierarchy_ops = {
    .xlate = gic_irq_domain_xlate,
    .alloc = gic_irq_domain_alloc,
    .free = irq_domain_free_irqs_top,
};

static int gic_irq_domain_xlate(struct irq_domain *d,
                                struct device_node *controller,
                                const u32 *intspec, unsigned int intsize,
                                unsigned long *out_hwirq, unsigned int *out_type)
{
    ...
    /* Get the interrupt number and add 16 to skip over SGIs */
    *out_hwirq = intspec[1] + 16;

    /* For SPIs, we need to add 16 more to get the GIC irq ID number (this skips the PPIs) */
    if (!intspec[0]) {
        ret = gic_routable_irq_domain_ops->xlate(d, controller,
                                                 intspec,
                                                 intsize,
                                                 out_hwirq,
                                                 out_type);

        if (IS_ERR_VALUE(ret))
            return ret;
    }

    /* Trigger type, one of four: rising edge, falling edge, high level, low level */
    *out_type = intspec[2] & IRQ_TYPE_SENSE_MASK;

    return ret;
}

static int gic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
                                unsigned int nr_irqs, void *arg)
{
    int i, ret;
    irq_hw_number_t hwirq;
    unsigned int type = IRQ_TYPE_NONE;
    struct of_phandle_args *irq_data = arg;

    /* First translate args into the hardware interrupt number and interrupt type */
    ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
                               irq_data->args_count, &hwirq, &type);
    if (ret)
        return ret;

    /* Map software to hardware numbers and set irq_desc->handle_irq according to the interrupt type */
    for (i = 0; i < nr_irqs; i++)
        gic_irq_domain_map(domain, virq + i, hwirq + i);

    return 0;
}

void irq_domain_free_irqs_top(struct irq_domain *domain, unsigned int virq,
                              unsigned int nr_irqs)
{
    int i;

    for (i = 0; i < nr_irqs; i++) {
        irq_set_handler_data(virq + i, NULL);
        irq_set_handler(virq + i, NULL);
    }
    irq_domain_free_irqs_common(domain, virq, nr_irqs);
}

For SPIs, an additional offset of 16 has to be applied.

static int gic_routable_irq_domain_xlate(struct irq_domain *d,
                                         struct device_node *controller,
                                         const u32 *intspec, unsigned int intsize,
                                         unsigned long *out_hwirq,
                                         unsigned int *out_type)
{
    *out_hwirq += 16;
    return 0;
}
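Putting the two offsets together, the translation from a three-cell interrupt specifier to a GIC hardware interrupt number can be sketched as follows. This is a standalone approximation of the logic described above, not the kernel function itself; `gic_xlate_sketch` and `SKETCH_SENSE_MASK` (standing in for IRQ_TYPE_SENSE_MASK, assumed here to be the low four bits) are names chosen for this example.

```c
#include <stdint.h>

#define SKETCH_SENSE_MASK 0x0fu  /* stands in for IRQ_TYPE_SENSE_MASK */

/* Translate a 3-cell interrupt specifier into a hardware IRQ number and type. */
static void gic_xlate_sketch(const uint32_t intspec[3],
                             unsigned long *out_hwirq,
                             unsigned int *out_type)
{
    *out_hwirq = intspec[1] + 16;   /* skip the 16 SGIs */
    if (intspec[0] == 0)            /* cell 0 == 0 marks an SPI */
        *out_hwirq += 16;           /* skip the 16 PPIs as well */
    *out_type = intspec[2] & SKETCH_SENSE_MASK;
}
```

With this sketch, the specifier {0, 5, 4} (an SPI) maps to hardware interrupt 37, while {1, 13, 0xf04} (a PPI) maps to hardware interrupt 29; in both cases the type is 4.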

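gic_irq_domain_map() takes a struct irq_domain plus the software and hardware interrupt numbers. It treats SGIs/PPIs as one group and SPIs as another.

Most of the work is done by irq_domain_set_info(). irq_domain_set_hwirq_and_chip() looks up the struct irq_data for the Linux interrupt number and associates it with the hardware interrupt number and with the struct irq_chip gic_chip.

__irq_set_handler() sets the irq_desc->handle_irq callback of the interrupt descriptor; for SPIs this is handle_fasteoi_irq().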

static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
                              irq_hw_number_t hw)
{
    if (hw < 32) {
        irq_set_percpu_devid(irq);    /* per-CPU interrupts carry their own special flag */
        irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
                            handle_percpu_devid_irq, NULL, NULL);
        set_irq_flags(irq, IRQF_VALID | IRQF_NOAUTOEN);
    } else {
        irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
                            handle_fasteoi_irq, NULL, NULL);
        set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);

        gic_routable_irq_domain_ops->map(d, irq, hw);
    }
    return 0;
}

void irq_domain_set_info(struct irq_domain *domain, unsigned int virq,
                         irq_hw_number_t hwirq, struct irq_chip *chip,
                         void *chip_data, irq_flow_handler_t handler,
                         void *handler_data, const char *handler_name)
{
    irq_domain_set_hwirq_and_chip(domain, virq, hwirq, chip, chip_data);
    __irq_set_handler(virq, handler, 0, handler_name);
    irq_set_handler_data(virq, handler_data);
}

int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
                                  irq_hw_number_t hwirq, struct irq_chip *chip,
                                  void *chip_data)
{
    struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);

    if (!irq_data)
        return -ENOENT;

    irq_data->hwirq = hwirq;
    irq_data->chip = chip ? chip : &no_irq_chip;
    irq_data->chip_data = chip_data;

    return 0;
}

void
__irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
                  const char *name)
{
    unsigned long flags;
    struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, 0);
    ...
    desc->handle_irq = handle;    /* assign irq_desc->handle_irq and name */
    desc->name = name;
    ...
}

drivers/irqchip/irq-gic.c registers gic_of_init as the handler for "arm,cortex-a15-gic"; gic_of_init is the initialization function of the GIC controller.

IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);

static int gic_cnt __initdata;

static int __init
gic_of_init(struct device_node *node, struct device_node *parent)
{
    ...
    gic_init_bases(gic_cnt, -1, dist_base, cpu_base, percpu_offset, node);
    if (!gic_cnt)
        gic_init_physaddr(node);

    if (parent) {
        irq = irq_of_parse_and_map(node, 0);
        gic_cascade_irq(gic_cnt, irq);
    }

    if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
        gicv2m_of_init(node, gic_data[gic_cnt].domain);

    gic_cnt++;
    return 0;
}

The gic_nr argument of gic_init_bases is the index of the GIC controller. The function mainly calls irq_domain_add_linear() to allocate and register an irq_domain.

void __init gic_init_bases(unsigned int gic_nr, int irq_start,
                           void __iomem *dist_base, void __iomem *cpu_base,
                           u32 percpu_offset, struct device_node *node)
{
    irq_hw_number_t hwirq_base;
    struct gic_chip_data *gic;
    int gic_irqs, irq_base, i;
    int nr_routable_irqs;

    BUG_ON(gic_nr >= MAX_GIC_NR);    /* gic_nr must stay below the system-defined MAX_GIC_NR */

    gic = &gic_data[gic_nr];         /* gic_data is a global array of struct gic_chip_data, indexed by GIC controller number */
    ...
    /*
     * Initialize the CPU interface map to all CPUs.
     * It will be refined as each CPU probes its ID.
     */
    for (i = 0; i < NR_GIC_CPU_IF; i++)
        gic_cpu_map[i] = 0xff;

    /*
     * Find out how many interrupts are supported.
     * The GIC only supports up to 1020 interrupt sources.
     */
    gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;    /* work out how many interrupt sources this GIC supports */
    gic_irqs = (gic_irqs + 1) * 32;
    if (gic_irqs > 1020)             /* the GIC supports at most 1020 interrupts */
        gic_irqs = 1020;
    gic->gic_irqs = gic_irqs;

    if (node) {    /* DT case */
        const struct irq_domain_ops *ops = &gic_irq_domain_hierarchy_ops;    /* struct irq_domain_ops for GICv2 */
        ...
        gic->domain = irq_domain_add_linear(node, gic_irqs, ops, gic);       /* register the irq_domain, using gic_irq_domain_hierarchy_ops as its operations */
    } else {       /* Non-DT case */
        ...
    }

    if (WARN_ON(!gic->domain))
        return;

    if (gic_nr == 0) {
#ifdef CONFIG_SMP
        set_smp_cross_call(gic_raise_softirq);
        register_cpu_notifier(&gic_cpu_notifier);
#endif
        /*
         * irq_handler calls handle_arch_irq. Pointing handle_arch_irq at
         * gic_handle_irq here ties the platform interrupt path to the
         * GIC-specific handler.
         */
        set_handle_irq(gic_handle_irq);
    }

    gic_chip.flags |= gic_arch_extn.flags;
    gic_dist_init(gic);    /* initialize the GIC Distributor */
    gic_cpu_init(gic);     /* initialize the GIC CPU Interface */
    gic_pm_init(gic);      /* GIC power-management initialization */
}
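The interrupt-count calculation in gic_init_bases can be isolated as a small pure function. The sketch below assumes the GIC-v2 convention that the low five bits of the Distributor type register encode the number of supported interrupt lines in units of 32, capped at the 1020 sources the GIC allows; `gic_nr_irqs_sketch` is a hypothetical name.

```c
#include <stdint.h>

/* Derive the number of supported interrupt sources from the raw value of
 * the Distributor type register (hypothetical helper for illustration). */
static unsigned int gic_nr_irqs_sketch(uint32_t typer)
{
    unsigned int n = typer & 0x1f;  /* low five bits: line count field */

    n = (n + 1) * 32;               /* field value N means 32 * (N + 1) lines */
    if (n > 1020)                   /* the GIC supports at most 1020 sources */
        n = 1020;
    return n;
}
```

For instance, a field value of 4 corresponds to 160 interrupt sources, and the maximum field value 0x1f is clamped from 1024 down to 1020.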

irq_domain_add_linear()->__irq_domain_add() allocates and initializes the struct irq_domain.

struct irq_domain *__irq_domain_add(struct device_node *of_node, int size,
                                    irq_hw_number_t hwirq_max, int direct_max,
                                    const struct irq_domain_ops *ops,
                                    void *host_data)
{
    struct irq_domain *domain;

    /* The allocation is sizeof(struct irq_domain) plus gic_irqs unsigned ints */
    domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size),
                          GFP_KERNEL, of_node_to_nid(of_node));
    if (WARN_ON(!domain))
        return NULL;

    /* Fill structure */
    INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL);
    domain->ops = ops;
    domain->host_data = host_data;
    domain->of_node = of_node_get(of_node);
    domain->hwirq_max = hwirq_max;
    domain->revmap_size = size;
    domain->revmap_direct_max_irq = direct_max;
    irq_domain_check_hierarchy(domain);

    mutex_lock(&irq_domain_mutex);
    list_add(&domain->link, &irq_domain_list);    /* add the new struct irq_domain to the global list irq_domain_list */
    mutex_unlock(&irq_domain_mutex);

    pr_debug("Added domain %s\n", domain->name);
    return domain;
}
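The allocation above relies on a flexible array member: the linear reverse map is appended directly to the domain structure. A minimal user-space sketch of the same idea, using `calloc` in place of `kzalloc_node` and hypothetical names throughout, might look like this:

```c
#include <stdlib.h>

/* Stripped-down stand-in for struct irq_domain (illustration only). */
struct domain_sketch {
    unsigned int revmap_size;
    unsigned int linear_revmap[];  /* flexible array: index = hwirq, value = virq */
};

/* Allocate the structure and its lookup table in one zeroed block. */
static struct domain_sketch *domain_alloc(unsigned int size)
{
    struct domain_sketch *d = calloc(1, sizeof(*d) + sizeof(unsigned int) * size);

    if (d)
        d->revmap_size = size;
    return d;
}

/* Record a hwirq -> virq mapping; returns -1 if hwirq is out of range. */
static int domain_map(struct domain_sketch *d, unsigned int hwirq, unsigned int virq)
{
    if (hwirq >= d->revmap_size)
        return -1;
    d->linear_revmap[hwirq] = virq;
    return 0;
}

/* Look up the virq for a hwirq; 0 means "no mapping" in this sketch. */
static unsigned int domain_find(const struct domain_sketch *d, unsigned int hwirq)
{
    return hwirq < d->revmap_size ? d->linear_revmap[hwirq] : 0;
}
```

Because the table is zero-initialized, a lookup before any mapping returns 0, which this sketch treats as "not mapped yet".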

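2.3 Interrupt Number Mapping at System Initialization

The previous subsection covered the initialization of the GIC. This one looks at how a hardware interrupt is mapped to an interrupt in the Linux number space.

customize_machine() is called at the arch_initcall stage, which is quite early.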

customize_machine

->of_platform_populate

->of_platform_bus_create

->of_amba_device_create

->of_amba_device_create

The details can be traced through the device tree files, together with arch/arm/boot/dts/vexpress-v2m.dtsi.

/dts-v1/;

/ {
model = "V2P-CA9";
arm,hbi = <0x191>;
arm,vexpress,site = <0xf>;
compatible = "arm,vexpress,v2p-ca9", "arm,vexpress";
interrupt-parent = <&gic>;
#address-cells = <>;
#size-cells = <>;
...
gic: interrupt-controller@1e001000 {
compatible = "arm,cortex-a9-gic";
#interrupt-cells = <>;
#address-cells = <>;
interrupt-controller;
reg = <0x1e001000 0x1000>,
<0x1e000100 0x100>;
};
...
smb {
compatible = "simple-bus"; #address-cells = <>;
#size-cells = <>;
ranges = < 0x40000000 0x04000000>,
< 0x44000000 0x04000000>,
< 0x48000000 0x04000000>,
< 0x4c000000 0x04000000>,
< 0x10000000 0x00020000>; #interrupt-cells = <>;
interrupt-map-mask = < >;
interrupt-map = < &gic >,
< &gic >,
...
/include/ "vexpress-v2m.dtsi"
};
}

The vexpress-v2m.dtsi file:
motherboard {
model = "V2M-P1";
arm,hbi = <0x190>;
arm,vexpress,site = <>;
compatible = "arm,vexpress,v2m-p1", "simple-bus";
#address-cells = <>; /* SMB chipselect number and offset */
#size-cells = <>;
#interrupt-cells = <>;
ranges;
...
iofpga@7,00000000 {
compatible = "arm,amba-bus", "simple-bus";
#address-cells = <>;
#size-cells = <>;
ranges = < 0x20000>;
...
v2m_serial0: uart@09000 {
compatible = "arm,pl011", "arm,primecell";
reg = <0x09000 0x1000>;
interrupts = <5>;
clocks = <&v2m_oscclk2>, <&smbclk>;
clock-names = "uartclk", "apb_pclk";
};
...
};
}

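The search for "simple-bus" starts at the root node; as the listing above shows, it leads to the smb device.

The smb device includes the vexpress-v2m.dtsi file, and of_platform_bus_create() then walks all of its devices.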

const struct of_device_id of_default_bus_match_table[] = {
    { .compatible = "simple-bus", },
#ifdef CONFIG_ARM_AMBA
    { .compatible = "arm,amba-bus", },
#endif /* CONFIG_ARM_AMBA */
    {} /* Empty terminated list */
};

static int __init customize_machine(void)
{
    ...
    /* Find devices matching "simple-bus"; here this leads to smb */
    of_platform_populate(NULL, of_default_bus_match_table,
                         NULL, NULL);
    ...
}

int of_platform_populate(struct device_node *root,
                         const struct of_device_id *matches,
                         const struct of_dev_auxdata *lookup,
                         struct device *parent)
{
    ...
    for_each_child_of_node(root, child) {
        /* root here refers to the root node, "/" */
        rc = of_platform_bus_create(child, matches, lookup, parent, true);
        if (rc)
            break;
    }
    ...
}

static int of_platform_bus_create(struct device_node *bus,
                                  const struct of_device_id *matches,
                                  const struct of_dev_auxdata *lookup,
                                  struct device *parent, bool strict)
{
    const struct of_dev_auxdata *auxdata;
    struct device_node *child;
    struct platform_device *dev;
    const char *bus_id = NULL;
    void *platform_data = NULL;
    int rc = 0;

    /* Make sure it has a compatible property */
    if (strict && (!of_get_property(bus, "compatible", NULL))) {
        pr_debug("%s() - skipping %s, no compatible prop\n",
                 __func__, bus->full_name);
        return 0;
    }

    auxdata = of_dev_lookup(lookup, bus);
    if (auxdata) {
        bus_id = auxdata->name;
        platform_data = auxdata->platform_data;
    }

    /*
     * A device matching "arm,primecell" is created as an AMBA device.
     * This is where uart@09000 under iofpga@7,00000000 is created.
     */
    if (of_device_is_compatible(bus, "arm,primecell")) {
        /*
         * Don't return an error here to keep compatibility with older
         * device tree files.
         */
        of_amba_device_create(bus, bus_id, platform_data, parent);
        return 0;
    }

    dev = of_platform_device_create_pdata(bus, bus_id, platform_data, parent);
    if (!dev || !of_match_node(matches, bus))
        return 0;

    /*
     * Walk every "simple-bus" device under smb. The nesting can be several
     * levels deep: smb->motherboard->iofpga@7,00000000.
     */
    for_each_child_of_node(bus, child) {
        pr_debug(" create child: %s\n", child->full_name);
        rc = of_platform_bus_create(child, matches, lookup, &dev->dev, strict);
        if (rc) {
            of_node_put(child);
            break;
        }
    }
    of_node_set_flag(bus, OF_POPULATED_BUS);
    return rc;
}

of_amba_device_create creates an ARM AMBA device and hands the interrupt part over to irq_of_parse_and_map().

static struct amba_device *of_amba_device_create(struct device_node *node,
                                                 const char *bus_id,
                                                 void *platform_data,
                                                 struct device *parent)
{
    ...
    /* Decode the IRQs and address ranges */
    for (i = 0; i < AMBA_NR_IRQS; i++)
        dev->irq[i] = irq_of_parse_and_map(node, i);
    ...
}

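Taking uart@09000 as an example, of_irq_parse_one(), called from irq_of_parse_and_map, parses device properties such as "interrupts" and "regs" and stores the result in a struct of_phandle_args: oirq->args[1] holds the interrupt number 5, and oirq->np holds the struct device_node.

irq_create_of_mapping() establishes the mapping from the hardware interrupt number to the Linux interrupt number.

The main calls made by irq_create_of_mapping are listed below; most of the work is handed to __irq_domain_alloc_irqs().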

irq_create_of_mapping

->domain->ops->xlate

->irq_find_mapping

->irq_domain_alloc_irqs

->__irq_domain_alloc_irqs

->irq_domain_alloc_descs

->irq_domain_alloc_irq_data

->irq_domain_alloc_irqs_recursive

->gic_irq_domain_alloc

->gic_irq_domain_map (maps the hardware interrupt number to the software interrupt number)

->irq_domain_set_info (stores the important parameters in the interrupt descriptor)

->irq_domain_insert_irq

unsigned int irq_of_parse_and_map(struct device_node *dev, int index)
{
    struct of_phandle_args oirq;

    if (of_irq_parse_one(dev, index, &oirq))
        return 0;

    return irq_create_of_mapping(&oirq);
}

unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
{
    struct irq_domain *domain;
    irq_hw_number_t hwirq;
    unsigned int type = IRQ_TYPE_NONE;
    int virq;

    /* Find the struct irq_domain the device belongs to */
    domain = irq_data->np ? irq_find_host(irq_data->np) : irq_default_domain;
    ...
    /* If domain has no translation, then we assume interrupt line */
    if (domain->ops->xlate == NULL)
        hwirq = irq_data->args[0];
    else {
        /* Call gic_irq_domain_xlate() to turn the interrupt specifier into a hardware interrupt number and trigger type */
        if (domain->ops->xlate(domain, irq_data->np, irq_data->args,
                               irq_data->args_count, &hwirq, &type))
            return 0;
    }

    if (irq_domain_is_hierarchy(domain)) {    /* domains can be stacked hierarchically */
        /*
         * If we've already configured this interrupt,
         * don't do it again, or hell will break loose.
         */
        virq = irq_find_mapping(domain, hwirq);    /* look for an existing Linux interrupt number in linear_revmap */
        if (virq)
            return virq;

        /* Not found: allocate a new mapping. The argument 1 means one interrupt is allocated per call. */
        virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);
        if (virq <= 0)
            return 0;
    } else {
        ...
    }

    /* Set type if specified and different than the current one */
    if (type != IRQ_TYPE_NONE &&
        type != irq_get_trigger_type(virq))
        irq_set_irq_type(virq, type);    /* set the interrupt trigger type */
    return virq;
}

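struct irq_desc is the interrupt descriptor. The irq_desc[] array holds NR_IRQS descriptors; the array index is the IRQ number, so the descriptor for an interrupt can be found from its IRQ number.

struct irq_desc embeds a struct irq_data, whose irq and hwirq members hold the software and hardware interrupt numbers respectively. These two members tie the hardware interrupt number to the software interrupt number.

struct irq_chip defines the set of low-level operations of an interrupt controller.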

struct irq_desc {
    struct irq_data irq_data;
    unsigned int __percpu *kstat_irqs;
    irq_flow_handler_t handle_irq;    /* flow handler, chosen by interrupt number: 0~31 use handle_percpu_devid_irq, 32 and above use handle_fasteoi_irq */
#ifdef CONFIG_IRQ_PREFLOW_FASTEOI
    irq_preflow_handler_t preflow_handler;
#endif
    struct irqaction *action;         /* IRQ action list */
    unsigned int status_use_accessors;
    unsigned int core_internal_state__do_not_mess_with_it;
    unsigned int depth;               /* nested irq disables */
    unsigned int wake_depth;          /* nested wake enables */
    unsigned int irq_count;           /* For detecting broken IRQs */
    unsigned long last_unhandled;     /* Aging timer for unhandled count */
    unsigned int irqs_unhandled;
    atomic_t threads_handled;
    int threads_handled_last;
    raw_spinlock_t lock;
    struct cpumask *percpu_enabled;
#ifdef CONFIG_SMP
    const struct cpumask *affinity_hint;
    struct irq_affinity_notify *affinity_notify;
#ifdef CONFIG_GENERIC_PENDING_IRQ
    cpumask_var_t pending_mask;
#endif
#endif
    unsigned long threads_oneshot;    /* bitmap; each bit stands for an interrupt thread of a shared oneshot interrupt that is being handled */
    atomic_t threads_active;          /* number of interrupt threads currently running */
    wait_queue_head_t wait_for_threads;
#ifdef CONFIG_PM_SLEEP
    unsigned int nr_actions;
    unsigned int no_suspend_depth;
    unsigned int cond_suspend_depth;
    unsigned int force_resume_depth;
#endif
#ifdef CONFIG_PROC_FS
    struct proc_dir_entry *dir;
#endif
    int parent_irq;
    struct module *owner;
    const char *name;
};

struct irq_data {
    u32 mask;
    unsigned int irq;                 /* Linux software interrupt number */
    unsigned long hwirq;              /* hardware interrupt number */
    unsigned int node;
    unsigned int state_use_accessors;
    struct irq_chip *chip;
    struct irq_domain *domain;
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
    struct irq_data *parent_data;
#endif
    void *handler_data;
    void *chip_data;
    struct msi_desc *msi_desc;
    cpumask_var_t affinity;
};

struct irq_chip {
    const char *name;
    unsigned int (*irq_startup)(struct irq_data *data);    /* start up the interrupt */
    void (*irq_shutdown)(struct irq_data *data);           /* shut down the interrupt */
    void (*irq_enable)(struct irq_data *data);             /* enable the interrupt */
    void (*irq_disable)(struct irq_data *data);            /* disable the interrupt */

    void (*irq_ack)(struct irq_data *data);                /* acknowledge the interrupt */
    void (*irq_mask)(struct irq_data *data);               /* mask the interrupt */
    void (*irq_mask_ack)(struct irq_data *data);           /* acknowledge and mask the interrupt */
    void (*irq_unmask)(struct irq_data *data);             /* unmask the interrupt */
    void (*irq_eoi)(struct irq_data *data);                /* send EOI: hardware interrupt handling is complete */

    int (*irq_set_affinity)(struct irq_data *data, const struct cpumask *dest, bool force);    /* bind the interrupt to a CPU */
    int (*irq_retrigger)(struct irq_data *data);           /* resend the interrupt to the CPU */
    int (*irq_set_type)(struct irq_data *data, unsigned int flow_type);    /* set the trigger type */
    int (*irq_set_wake)(struct irq_data *data, unsigned int on);           /* enable/disable the interrupt as a power-management wakeup source */

    void (*irq_bus_lock)(struct irq_data *data);
    void (*irq_bus_sync_unlock)(struct irq_data *data);

    void (*irq_cpu_online)(struct irq_data *data);
    void (*irq_cpu_offline)(struct irq_data *data);

    void (*irq_suspend)(struct irq_data *data);
    void (*irq_resume)(struct irq_data *data);
    void (*irq_pm_shutdown)(struct irq_data *data);
    ...
    unsigned long flags;
};

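gic_chip is the set of hardware operations for this particular interrupt controller. For GICv2 it provides mask/unmask, EOI, setting the trigger type, and setting or getting the current chip state.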

static const struct irq_chip gic_chip = {
    .irq_mask = gic_mask_irq,
    .irq_unmask = gic_unmask_irq,
    .irq_eoi = gic_eoi_irq,
    .irq_set_type = gic_set_type,
    .irq_get_irqchip_state = gic_irq_get_irqchip_state,
    .irq_set_irqchip_state = gic_irq_set_irqchip_state,
    .flags = IRQCHIP_SET_TYPE_MASKED |
             IRQCHIP_SKIP_SET_WAKE |
             IRQCHIP_MASK_ON_SUSPEND,
};

static void gic_mask_irq(struct irq_data *d)
{
    gic_poke_irq(d, GIC_DIST_ENABLE_CLEAR);
}

static void gic_unmask_irq(struct irq_data *d)
{
    gic_poke_irq(d, GIC_DIST_ENABLE_SET);
}

static void gic_eoi_irq(struct irq_data *d)
{
    writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_EOI);
}

static int gic_set_type(struct irq_data *d, unsigned int type)
{
    void __iomem *base = gic_dist_base(d);
    unsigned int gicirq = gic_irq(d);

    /* Interrupt configuration for SGIs can't be changed */
    if (gicirq < 16)
        return -EINVAL;

    /* SPIs have restrictions on the supported types */
    if (gicirq >= 32 && type != IRQ_TYPE_LEVEL_HIGH &&
                        type != IRQ_TYPE_EDGE_RISING)
        return -EINVAL;

    return gic_configure_irq(gicirq, type, base, NULL);
}

static int gic_irq_set_irqchip_state(struct irq_data *d,
                                     enum irqchip_irq_state which, bool val)
{
    u32 reg;

    switch (which) {
    case IRQCHIP_STATE_PENDING:
        reg = val ? GIC_DIST_PENDING_SET : GIC_DIST_PENDING_CLEAR;
        break;

    case IRQCHIP_STATE_ACTIVE:
        reg = val ? GIC_DIST_ACTIVE_SET : GIC_DIST_ACTIVE_CLEAR;
        break;

    case IRQCHIP_STATE_MASKED:
        reg = val ? GIC_DIST_ENABLE_CLEAR : GIC_DIST_ENABLE_SET;
        break;

    default:
        return -EINVAL;
    }

    gic_poke_irq(d, reg);
    return 0;
}

static int gic_irq_get_irqchip_state(struct irq_data *d,
                                     enum irqchip_irq_state which, bool *val)
{
    switch (which) {
    case IRQCHIP_STATE_PENDING:
        *val = gic_peek_irq(d, GIC_DIST_PENDING_SET);
        break;

    case IRQCHIP_STATE_ACTIVE:
        *val = gic_peek_irq(d, GIC_DIST_ACTIVE_SET);
        break;

    case IRQCHIP_STATE_MASKED:
        *val = !gic_peek_irq(d, GIC_DIST_ENABLE_SET);
        break;

    default:
        return -EINVAL;
    }

    return 0;
}
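The listing does not show the bodies of gic_poke_irq() and gic_peek_irq(). The sketch below illustrates the bit arithmetic such helpers need, assuming the usual GIC layout of one bit per interrupt and 32 interrupts per 32-bit register. It simulates the enable bits with a plain array instead of memory-mapped registers, and all names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_NR_WORDS 32  /* 32 words x 32 bits covers IDs 0..1023 */

/* Simulated enable state; real hardware uses separate set/clear registers. */
static uint32_t enable_bits[SKETCH_NR_WORDS];

/* Index of the 32-bit word that holds the bit for this interrupt. */
static uint32_t irq_word(unsigned int hwirq)
{
    return hwirq / 32;
}

/* Bit mask of this interrupt within its word. */
static uint32_t irq_mask_bit(unsigned int hwirq)
{
    return 1u << (hwirq % 32);
}

/* Equivalent of writing the mask to an "enable set" register. */
static void sketch_unmask(unsigned int hwirq)
{
    enable_bits[irq_word(hwirq)] |= irq_mask_bit(hwirq);
}

/* Equivalent of writing the mask to an "enable clear" register. */
static void sketch_mask(unsigned int hwirq)
{
    enable_bits[irq_word(hwirq)] &= ~irq_mask_bit(hwirq);
}

/* Equivalent of peeking the enable bit. */
static int sketch_is_enabled(unsigned int hwirq)
{
    return (enable_bits[irq_word(hwirq)] & irq_mask_bit(hwirq)) != 0;
}
```

For hardware interrupt 37, the bit lives in word 1 at bit position 5, so a driver would write `1 << 5` to the second register of the relevant bank.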

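irq_domain_alloc_irqs() calls __irq_domain_alloc_irqs() to set up the struct irq_desc, the struct irq_data, and the interrupt mapping.

The nr_irqs argument is normally 1, so one interrupt is processed per call.

irq_domain_alloc_descs()->irq_alloc_descs()->__irq_alloc_descs() allocates the struct irq_desc; the value returned is the Linux interrupt number.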

int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
                            unsigned int nr_irqs, int node, void *arg,
                            bool realloc)
{
    ...
    if (realloc && irq_base >= 0) {
        virq = irq_base;
    } else {
        /* Find the first run of nr_irqs free bits in the allocated_irqs bitmap; this ends up in __irq_alloc_descs */
        virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
        if (virq < 0) {
            pr_debug("cannot allocate IRQ(base %d, count %d)\n",
                     irq_base, nr_irqs);
            return virq;
        }
    }

    if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {    /* allocate the struct irq_data */
        pr_debug("cannot allocate memory for IRQ%d\n", virq);
        ret = -ENOMEM;
        goto out_free_desc;
    }

    mutex_lock(&irq_domain_mutex);
    /* Call the alloc callback of the struct irq_domain to map the hardware interrupt number to the software one */
    ret = irq_domain_alloc_irqs_recursive(domain, virq, nr_irqs, arg);
    if (ret < 0) {
        mutex_unlock(&irq_domain_mutex);
        goto out_free_irq_data;
    }
    for (i = 0; i < nr_irqs; i++)
        irq_domain_insert_irq(virq + i);
    mutex_unlock(&irq_domain_mutex);

    return virq;
    ...
}

int __ref
__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
                  struct module *owner)
{
    ...
    mutex_lock(&sparse_irq_lock);

    /* Find the first area of cnt consecutive zero bits in the allocated_irqs bitmap */
    start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
                                       from, cnt, 0);
    ...
    bitmap_set(allocated_irqs, start, cnt);    /* mark these bits as in use */
    mutex_unlock(&sparse_irq_lock);
    /*
     * What happens next depends on CONFIG_SPARSE_IRQ. If it is defined, a
     * struct irq_desc is allocated dynamically and stored in a radix tree;
     * otherwise the descriptor is simply an offset into the global irq_desc
     * array.
     */
    return alloc_descs(start, cnt, node, owner);

err:
    mutex_unlock(&sparse_irq_lock);
    return ret;
}
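The core of the descriptor allocation is "find the first run of cnt free slots, then mark them used". The following sketch shows that search over a simple array of booleans rather than a packed kernel bitmap; `find_zero_area` and `mark_used` are illustrative names and do not reproduce the kernel's bitmap API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the first index >= from at which cnt consecutive slots are free,
 * or nbits if no such run exists. */
static size_t find_zero_area(const bool *used, size_t nbits, size_t from, size_t cnt)
{
    size_t run = 0;

    if (cnt == 0)
        return from;
    for (size_t i = from; i < nbits; i++) {
        run = used[i] ? 0 : run + 1;   /* length of the current free run */
        if (run == cnt)
            return i + 1 - cnt;        /* start of the run */
    }
    return nbits;
}

/* Mark cnt slots starting at start as allocated. */
static void mark_used(bool *used, size_t start, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++)
        used[start + i] = true;
}
```

The returned start index plays the role of the Linux interrupt number handed back to the caller.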

irq_domain_alloc_irqs_recursive() decides, based on the actual configuration, whether the interrupt controllers have to be processed recursively.

static int irq_domain_alloc_irqs_recursive(struct irq_domain *domain,
                                           unsigned int irq_base,
                                           unsigned int nr_irqs, void *arg)
{
    int ret = 0;
    struct irq_domain *parent = domain->parent;
    bool recursive = irq_domain_is_auto_recursive(domain);

    BUG_ON(recursive && !parent);
    if (recursive)
        ret = irq_domain_alloc_irqs_recursive(parent, irq_base,
                                              nr_irqs, arg);
    if (ret >= 0)
        ret = domain->ops->alloc(domain, irq_base, nr_irqs, arg);
    if (ret < 0 && recursive)
        irq_domain_free_irqs_recursive(parent, irq_base, nr_irqs);

    return ret;
}
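The recursion pattern is: allocate in the parent domain first, then in the domain itself, and undo the parent's allocation if the child fails. A minimal standalone model of that pattern is sketched below; the `struct dom` type, its fields, and the helper functions are all invented for illustration.

```c
#include <stddef.h>

/* Toy stand-in for a hierarchical irq domain (illustration only). */
struct dom {
    struct dom *parent;
    int auto_recursive;            /* stands in for irq_domain_is_auto_recursive() */
    int (*alloc)(struct dom *d);   /* returns 0 on success, negative on error */
    int allocated;
};

static int alloc_ok(struct dom *d)
{
    d->allocated = 1;
    return 0;
}

static int alloc_fail(struct dom *d)
{
    (void)d;
    return -1;
}

/* Undo an allocation, walking up the hierarchy the same way alloc did. */
static void free_recursive(struct dom *d)
{
    d->allocated = 0;
    if (d->auto_recursive && d->parent)
        free_recursive(d->parent);
}

/* Parent first, then this domain; roll the parent back on failure. */
static int alloc_recursive(struct dom *d)
{
    int ret = 0;
    int recursive = d->auto_recursive && d->parent != NULL;

    if (recursive)
        ret = alloc_recursive(d->parent);
    if (ret >= 0)
        ret = d->alloc(d);
    if (ret < 0 && recursive)
        free_recursive(d->parent);
    return ret;
}
```

With a two-level hierarchy, a successful call leaves both levels allocated, while a failure in the child leaves the parent rolled back.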

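This completes the parsing of the interrupt information in the device tree, the initialization of the related data structures, and, most importantly, the mapping from hardware interrupt numbers to Linux interrupt numbers.

3. ARM Low-Level Interrupt Handling

ARM low-level interrupt handling covers the path from the interrupt exception being raised up to irq_handler.

3.1 Hardware Behavior on an Interrupt

When a peripheral has an event to report to the SoC, it sends an interrupt signal over the interrupt pin connected to the SoC. The signal may be edge-triggered or level-triggered.

The interrupt controller senses the signal. Its Distributor selects the highest-priority interrupt and sends it to the CPU Interface, and the CPU Interface decides whether to forward the interrupt to its CPU core.

The GIC notifies the CPU core through an nIRQ (IRQ request input line) signal.

Once the CPU core senses the interrupt, the hardware does the following:

  • Saves the contents of CPSR at the time of the interrupt into SPSR_irq.
  • Modifies CPSR to put the CPU into IRQ mode, one of the processor modes; that is, the M field of CPSR is set to IRQ mode.
  • Automatically masks IRQ or FIQ by setting the IRQ bit or FIQ bit in CPSR to 1. In other words, the hardware disables interrupts automatically.
  • Saves the return address in LR_irq.
  • Jumps automatically to the IRQ vector in the exception vector table. From this point on, software takes over.

When returning from the interrupt, software must do the following:

  • Restore CPSR from SPSR_irq.
  • Restore PC from LR_irq, so that execution resumes at the instruction following the point of interruption.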
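The hardware entry steps and the software return steps can be modeled on a tiny register file. This is a sketch under stated assumptions: the mode encodings (USR 0x10, IRQ 0x12, SVC 0x13), the I bit at 0x80, and the `cpu_sketch` structure are written out here for illustration, and the return path applies a caller-supplied correction to LR_irq rather than modeling any particular return instruction.

```c
#include <stdint.h>

#define SK_MODE_MASK 0x1fu
#define SK_MODE_USR  0x10u
#define SK_MODE_IRQ  0x12u
#define SK_MODE_SVC  0x13u
#define SK_PSR_I_BIT 0x80u  /* IRQ disable bit in CPSR */

/* Minimal register file for the model (illustration only). */
struct cpu_sketch {
    uint32_t cpsr;
    uint32_t spsr_irq;
    uint32_t lr_irq;
    uint32_t pc;
};

/* What the hardware does when it takes an IRQ. */
static void irq_enter(struct cpu_sketch *c, uint32_t return_addr, uint32_t vector)
{
    c->spsr_irq = c->cpsr;                               /* save CPSR */
    c->cpsr = (c->cpsr & ~SK_MODE_MASK) | SK_MODE_IRQ;   /* switch to IRQ mode */
    c->cpsr |= SK_PSR_I_BIT;                             /* mask further IRQs */
    c->lr_irq = return_addr;                             /* save return address */
    c->pc = vector;                                      /* jump to the IRQ vector */
}

/* What software does to return from the interrupt. */
static void irq_return(struct cpu_sketch *c, uint32_t correction)
{
    c->cpsr = c->spsr_irq;             /* restore CPSR */
    c->pc = c->lr_irq - correction;    /* resume after the interrupted instruction */
}
```

Entering from SVC mode and returning restores the original CPSR, including the mode field and interrupt mask bits.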

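3.2 Interrupt Exception Vectors

3.2.1 Initializing the Exception Vector Code Sections

At kernel build time, the exception vector table is placed in the __init section of the executable; see arch/arm/kernel/vmlinux.lds.S.

__vectors_start and __vectors_end mark the start and end addresses of the .vectors section, while __stubs_start and __stubs_end delimit the exception vector stubs code section. Both are page-aligned, and each is one page in size.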

    __vectors_start = .;
    .vectors 0 : AT(__vectors_start) {
        *(.vectors)    /* holds the .vectors section data */
    }
    . = __vectors_start + SIZEOF(.vectors);
    __vectors_end = .;

    __stubs_start = .;
    .stubs 0x1000 : AT(__stubs_start) {
        *(.stubs)      /* holds the .stubs section data */
    }
    . = __stubs_start + SIZEOF(.stubs);
    __stubs_end = .;

During system initialization these two sections are copied to the high address 0xffff_0000: start_kernel->setup_arch->paging_init->devicemaps_init.

static void __init devicemaps_init(const struct machine_desc *mdesc)
{
    struct map_desc map;
    unsigned long addr;
    void *vectors;

    /*
     * Allocate the vector page early.
     */
    vectors = early_alloc(PAGE_SIZE * 2);    /* allocate two pages to be mapped at the high-vectors address */

    early_trap_init(vectors);                /* performs the copy of the exception vector table */
    ...
    /*
     * Create a mapping for the machine vectors at the high-vectors
     * location (0xffff0000). If we aren't using high-vectors, also
     * create a mapping at the low-vectors virtual address.
     */
    map.pfn = __phys_to_pfn(virt_to_phys(vectors));    /* physical page frame number of vectors */
    map.virtual = 0xffff0000;                /* target virtual address range 0xffff_0000~0xffff_0fff */
    map.length = PAGE_SIZE;                  /* size of the mapped region */
#ifdef CONFIG_KUSER_HELPERS
    map.type = MT_HIGH_VECTORS;              /* map as high vectors */
#else
    map.type = MT_LOW_VECTORS;
#endif
    create_mapping(&map);

    if (!vectors_high()) {
        map.virtual = 0;
        map.length = PAGE_SIZE * 2;
        map.type = MT_LOW_VECTORS;
        create_mapping(&map);
    }

    /* Now create a kernel read-only mapping */
    map.pfn += 1;
    map.virtual = 0xffff0000 + PAGE_SIZE;    /* mapped at 0xffff_1000~0xffff_1fff */
    map.length = PAGE_SIZE;
    map.type = MT_LOW_VECTORS;
    create_mapping(&map);
    ...
}

early_trap_init copies the two pages at __vectors_start and __stubs_start into the two pages that were allocated.

void __init early_trap_init(void *vectors_base)
{
    ...
    unsigned long vectors = (unsigned long)vectors_base;
    extern char __stubs_start[], __stubs_end[];
    extern char __vectors_start[], __vectors_end[];
    unsigned i;

    vectors_page = vectors_base;

    /*
     * Poison the vectors page with an undefined instruction. This
     * instruction is chosen to be undefined for both ARM and Thumb
     * ISAs. The Thumb version is an undefined instruction with a
     * branch back to the undefined instruction.
     */
    for (i = 0; i < PAGE_SIZE / sizeof(u32); i++)
        ((u32 *)vectors_base)[i] = 0xe7fddef1;    /* fill the whole first page with the undefined instruction 0xe7fddef1 */

    /*
     * Copy the vectors, stubs and kuser helpers (in entry-armv.S)
     * into the vector page, mapped at 0xffff0000, and ensure these
     * are visible to the instruction stream.
     */
    memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
    memcpy((void *)vectors + 0x1000, __stubs_start, __stubs_end - __stubs_start);
    ...
}

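3.2.2 Interrupt Exception Vectors

After an interrupt occurs, execution jumps through the vector table to vector_irq. At its end, vector_irq branches to __irq_usr or __irq_svc depending on the mode the CPU was in when the interrupt occurred.

vector_irq is defined by the vector_stub macro in arch/arm/kernel/entry-armv.S.

Why is correction==4, that is, why must 4 bytes be subtracted to obtain the return address?

The correction argument of the vector_stub macro is 4.

Suppose an interrupt occurs while instruction A is executing. Because of the ARM pipeline and instruction prefetch, pc points to A+8 at that moment. The interrupt cannot be handled until instruction A has completed, by which time pc has advanced to A+12.

Just before the interrupt is taken, the contents of pc are loaded into lr as lr=pc-4, which is address A+8.

On return, pc=lr-4 (that is, A+4) is therefore the next instruction to execute after the interrupted one, which is why lr must be moved back by 4 bytes.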

    .section .vectors, "ax", %progbits
__vectors_start:
    W(b) vector_rst
    W(b) vector_und
    W(ldr) pc, __vectors_start + 0x1000
    W(b) vector_pabt
    W(b) vector_dabt
    W(b) vector_addrexcptn
    W(b) vector_irq                   @ branch to vector_irq
    W(b) vector_fiq

/*
 * Interrupt dispatcher
 */
    vector_stub irq, IRQ_MODE, 4      @ the vector_stub macro defines vector_irq

    .long __irq_usr     @ 0 (USR_26 / USR_32)
    .long __irq_invalid @ 1 (FIQ_26 / FIQ_32)
    .long __irq_invalid @ 2 (IRQ_26 / IRQ_32)
    .long __irq_svc     @ 3 (SVC_26 / SVC_32); SVC mode is 0b10011, which gives 3 after masking with 0xf
    .long __irq_invalid @ 4
    .long __irq_invalid @ 5
    .long __irq_invalid @ 6
    .long __irq_invalid @ 7
    .long __irq_invalid @ 8
    .long __irq_invalid @ 9
    .long __irq_invalid @ a
    .long __irq_invalid @ b
    .long __irq_invalid @ c
    .long __irq_invalid @ d
    .long __irq_invalid @ e
    .long __irq_invalid @ f
.macro vector_stub, name, mode, correction=0------------------------------------vector_stub宏定义

    .align 5
vector_\name:
.if \correction
sub lr, lr, #\correction-------------------------------------------------------correction==4解释
.endif @
@ Save r0, lr_<exception> (parent PC) and spsr_<exception>
@ (parent CPSR)
@
stmia sp, {r0, lr} @ save r0, lr
mrs lr, spsr
str lr, [sp, #] @ save spsr @
@ Prepare for SVC32 mode. IRQs remain disabled.
@
mrs r0, cpsr
eor r0, r0, #(\mode ^ SVC_MODE | PSR_ISETSTATE)---------------------------------修改CPSR寄存器的控制域为SVC模式,为了使中断处理在SVC模式下执行。
msr spsr_cxsf, r0 @
@ the branch table must immediately follow this code
@
and lr, lr, #0x0f--------------------------------------------------------------低4位反映了进入中断前CPU的运行模式,9为USR,3为SVC模式。
THUMB( adr r0, 1f )
THUMB( ldr lr, [r0, lr, lsl #] )-------------------------------------------根据中断发生点所在的模式,给lr寄存器赋值,__irq_usr或者__irq_svc标签处。
mov r0, spk
ARM( ldr lr, [pc, lr, lsl #] )---------------------------------------------得到的lr就是".long __irq_svc"
movs pc, lr @ branch to handler in SVC mode-------------------------把lr的值赋给pc指针,跳转到__irq_usr或者__irq_svc。
ENDPROC(vector_\name)
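The table lookup performed by vector_stub can be mimicked in plain C. This is only a sketch: the enum irq_entry values stand in for the assembly labels and are not kernel symbols, while the mode encodings 0x10 (USR) and 0x13 (SVC) are the architectural values of the CPSR mode field.

```c
#include <assert.h>
#include <stdint.h>

/* Mode encodings of the ARM CPSR/SPSR mode field (low 5 bits). */
#define USR_MODE 0x10u
#define SVC_MODE 0x13u

/* Hypothetical stand-ins for the assembly labels. */
enum irq_entry { IRQ_INVALID = 0, IRQ_USR, IRQ_SVC };

/* 16-entry table with the same layout as the ".long" table after vector_stub. */
static const enum irq_entry irq_branch_table[16] = {
    [0x0] = IRQ_USR,   /* USR mode: 0b10000 & 0xf = 0 */
    [0x3] = IRQ_SVC,   /* SVC mode: 0b10011 & 0xf = 3 */
    /* every other slot defaults to IRQ_INVALID (0) */
};

/* Mirrors "and lr, lr, #0x0f" followed by the indexed load. */
static enum irq_entry select_irq_entry(uint32_t spsr)
{
    uint32_t index = spsr & 0x0fu;
    assert(index < 16u);
    return irq_branch_table[index];
}
```

Only the low four bits of the saved status register take part in the lookup, so condition flags in the upper bits do not affect the result.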

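3.3 Handling interrupts taken in kernel space: __irq_svc

__irq_svc handles interrupts taken while the CPU was running in kernel space. svc_entry saves the interrupted context and irq_handler runs the interrupt handling. If kernel preemption is enabled, the code then checks whether the current task may be preempted. Finally, svc_exit performs the return from the exception.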

__irq_svc:
    svc_entry
    irq_handler

    @ With CONFIG_PREEMPT, this is where preemption can happen once the interrupt has been handled.
#ifdef CONFIG_PREEMPT
    get_thread_info tsk
    ldr     r8, [tsk, #TI_PREEMPT]      @ get preempt count: thread_info->preempt_count; 0 means the task may be preempted, greater than 0 means it may not
    ldr     r0, [tsk, #TI_FLAGS]        @ get flags: thread_info->flags
    teq     r8, #0                      @ if preempt count != 0
    movne   r0, #0                      @ force flags to 0
    tst     r0, #_TIF_NEED_RESCHED      @ test whether _TIF_NEED_RESCHED is set
    blne    svc_preempt
#endif

    svc_exit r5, irq = 1                @ return from exception
 UNWIND(.fnend          )
ENDPROC(__irq_svc)
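The preemption test at the end of __irq_svc boils down to two conditions. The sketch below restates it in C; the bit position chosen for TIF_NEED_RESCHED_BIT is an assumption made for the example (the real value comes from the architecture's thread_info.h), and irq_exit_should_preempt is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bit for this sketch only. */
#define TIF_NEED_RESCHED_BIT (1u << 1)

/* Returns true when the assembly above would branch to svc_preempt. */
static bool irq_exit_should_preempt(uint32_t preempt_count, uint32_t ti_flags)
{
    if (preempt_count != 0)     /* teq r8, #0 ; movne r0, #0 */
        ti_flags = 0;
    return (ti_flags & TIF_NEED_RESCHED_BIT) != 0;   /* tst ; blne svc_preempt */
}
```

Forcing the flags to zero when preempt_count is non-zero lets the assembly use a single tst/blne pair instead of two separate branches.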

svc_entry saves the interrupted context on the kernel stack, mainly the registers that make up struct pt_regs.

    .macro  svc_entry, stack_hole=0, trace=1
 UNWIND(.fnstart        )
 UNWIND(.save {r0 - pc}     )
    sub     sp, sp, #(S_FRAME_SIZE + \stack_hole - 4)
#ifdef CONFIG_THUMB2_KERNEL
 SPFIX( str     r0, [sp]    )   @ temporarily saved
 SPFIX( mov     r0, sp      )
 SPFIX( tst     r0, #4      )   @ test original stack alignment
 SPFIX( ldr     r0, [sp]    )   @ restored
#else
 SPFIX( tst     sp, #4      )
#endif
 SPFIX( subeq   sp, sp, #4  )
    stmia   sp, {r1 - r12}

    ldmia   r0, {r3 - r5}
    add     r7, sp, #S_SP - 4       @ here for interlock avoidance
    mov     r6, #-1                 @  ""  ""      ""       ""
    add     r2, sp, #(S_FRAME_SIZE + \stack_hole - 4)
 SPFIX( addeq   r2, r2, #4  )
    str     r3, [sp, #-4]!          @ save the "real" r0 copied
                                    @ from the exception stack

    mov     r3, lr

    @
    @ We are now ready to fill in the remaining blanks on the stack:
    @
    @  r2 - sp_svc
    @  r3 - lr_svc
    @  r4 - lr_<exception>, already fixed up for correct return/restart
    @  r5 - spsr_<exception>
    @  r6 - orig_r0 (see pt_regs definition in ptrace.h)
    @
    stmia   r7, {r2 - r6}

    .if \trace
#ifdef CONFIG_TRACE_IRQFLAGS
    bl      trace_hardirqs_off
#endif
    .endif
    .endm

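svc_exit prepares the return to the interrupted context and then uses ldmia to restore 16 registers (r0 through pc) from the stack. Loading pc completes the interrupt and resumes the interrupted code.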

    .macro    svc_exit, rpsr, irq = ...
msr spsr_cxsf, \rpsr
ldmia sp, {r0 - pc}^ @ load r0 - pc, cpsr
.endm

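irq_handler enters the high-level interrupt handling.

4. High-level interrupt handling

The irq_handler assembly macro is the boundary between the architecture layer and the high-level interrupt handling. Here control passes from assembly to C for the GIC-specific processing.

Earlier sections showed how a hardware interrupt number is mapped to a Linux interrupt number. When an interrupt fires, what path does it take from the hardware, through software identifying the hardware interrupt number, to the conversion into a Linux interrupt number?

The flow is analysed starting from irq_handler: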

irq_handler()

->handle_arch_irq()->gic_handle_irq()---------------------------------read the IAR register to acknowledge the interrupt and obtain the hardware interrupt number

->handle_domain_irq()->__handle_domain_irq()

->irq_find_mapping()------------------------------------------------convert the hardware interrupt number into a Linux interrupt number

->generic_handle_irq()---------------------------------------------everything from here on works with the Linux interrupt number

->handle_percpu_devid_irq()-----------------------------------handling of SGI/PPI interrupts

->handle_fasteoi_irq()--------------------------------------------handling of SPI interrupts

->handle_irq_event()->handle_irq_event_percpu()------core of the interrupt handling

->action->handler-----------------------------------------------run the primary handler

->__irq_wake_thread()----------------------------------------wake the interrupt kernel thread if required

4.1 irq_handler

The irq_handler macro calls the function that handle_arch_irq points to. That pointer is registered through set_handle_irq(); for GICv2 it is gic_handle_irq.

    .macro  irq_handler
#ifdef CONFIG_MULTI_IRQ_HANDLER
    ldr     r1, =handle_arch_irq
    mov     r0, sp
    adr     lr, BSYM(9997f)
    ldr     pc, [r1]
#else
    arch_irq_handler_default
#endif
9997:
    .endm

4.2 gic_handle_irq

gic_init_bases() sets handle_arch_irq to gic_handle_irq.

void __init gic_init_bases(unsigned int gic_nr, int irq_start,
                           void __iomem *dist_base, void __iomem *cpu_base,
                           u32 percpu_offset, struct device_node *node)
{
    ...
    if (gic_nr == 0) {
        ...
        set_handle_irq(gic_handle_irq);
    }
    ...
}

void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
{
    if (handle_arch_irq)
        return;

    handle_arch_irq = handle_irq;
}

gic_handle_irq splits interrupts into two groups: SGI and PPI/SPI.

SGIs are handed to handle_IPI(); PPIs and SPIs are handed to handle_domain_irq().

static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
    u32 irqstat, irqnr;
    struct gic_chip_data *gic = &gic_data[0];
    void __iomem *cpu_base = gic_data_cpu_base(gic);

    do {
        irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);   /* read the IAR register to acknowledge the interrupt */
        irqnr = irqstat & GICC_IAR_INT_ID_MASK;                /* the mask is 0x3ff (low 10 bits), so IDs range over 0~1023 */

        if (likely(irqnr > 15 && irqnr < 1021)) {
            handle_domain_irq(gic->domain, irqnr, regs);
            continue;
        }
        if (irqnr < 16) {                                       /* SGIs are used for inter-CPU communication and only matter with CONFIG_SMP */
            writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);    /* write the EOI register directly to end the interrupt */
#ifdef CONFIG_SMP
            handle_IPI(irqnr, regs);                            /* irqnr identifies the SGI */
#endif
            continue;
        }
        break;
    } while (1);
}
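The routing decision inside gic_handle_irq() depends only on the interrupt ID extracted from the IAR value. A minimal user-space sketch of that classification follows; enum gic_route and classify_iar are illustrative names, not part of the GIC driver.

```c
#include <assert.h>
#include <stdint.h>

#define GICC_IAR_INT_ID_MASK 0x3ffu   /* low 10 bits of IAR hold the interrupt ID */

/* Hypothetical labels for the three outcomes of the loop body. */
enum gic_route { ROUTE_IPI, ROUTE_DOMAIN, ROUTE_SPURIOUS };

static enum gic_route classify_iar(uint32_t irqstat)
{
    uint32_t irqnr = irqstat & GICC_IAR_INT_ID_MASK;

    assert(irqnr <= GICC_IAR_INT_ID_MASK);   /* always holds after masking */
    if (irqnr > 15 && irqnr < 1021)
        return ROUTE_DOMAIN;     /* PPI or SPI: handle_domain_irq() */
    if (irqnr < 16)
        return ROUTE_IPI;        /* SGI: EOI, then handle_IPI() */
    return ROUTE_SPURIOUS;       /* 1021~1023: leave the loop */
}
```

Bits above the ID field (for an SGI they carry the source CPU) are ignored by the classification, which is why the kernel writes the full irqstat value back to EOI rather than irqnr.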

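handle_domain_irq() calls __handle_domain_irq() with lookup set to true.

irq_enter() explicitly tells the kernel that it is entering interrupt context; once the interrupt has been handled, irq_exit() tells the kernel that interrupt handling is complete.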

int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
                        bool lookup, struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);
    unsigned int irq = hwirq;
    int ret = 0;

    irq_enter();                                /* increment the hardirq count to mark entry into interrupt context */

#ifdef CONFIG_IRQ_DOMAIN
    if (lookup)
        irq = irq_find_mapping(domain, hwirq);  /* find the Linux interrupt number for the hardware interrupt number */
#endif

    /*
     * Some hardware gives randomly wrong interrupts.  Rather
     * than crashing, do something sensible.
     */
    if (unlikely(!irq || irq >= nr_irqs)) {
        ack_bad_irq(irq);
        ret = -EINVAL;
    } else {
        generic_handle_irq(irq);                /* handle this specific interrupt; irq is now a Linux interrupt number */
    }

    irq_exit();                                 /* leave interrupt context */
    set_irq_regs(old_regs);
    return ret;
}

irq_find_mapping() looks up, in the struct irq_domain, the Linux irq that corresponds to hwirq.

unsigned int irq_find_mapping(struct irq_domain *domain,
                              irq_hw_number_t hwirq)
{
    struct irq_data *data;
    ...
    /* Check if the hwirq is in the linear revmap. */
    if (hwirq < domain->revmap_size)
        return domain->linear_revmap[hwirq];    /* linear_revmap[] is filled in by __irq_domain_alloc_irqs()->irq_domain_insert_irq() */
    ...
}
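The linear reverse map is simply an array indexed by hardware interrupt number. The sketch below shows the idea with a simplified structure; struct mini_domain and mini_find_mapping are invented for the example and leave out the other lookup paths of the real function.

```c
#include <assert.h>

/* Simplified stand-in for the linear part of struct irq_domain. */
struct mini_domain {
    unsigned int revmap_size;
    const unsigned int *linear_revmap;   /* index: hwirq, value: Linux irq (0 = unmapped) */
};

static unsigned int mini_find_mapping(const struct mini_domain *d, unsigned long hwirq)
{
    assert(d);
    if (hwirq < d->revmap_size)
        return d->linear_revmap[hwirq];
    return 0;   /* outside the linear map: treated as unmapped in this sketch */
}
```

A return value of 0 means "no mapping", which matches the !irq check in __handle_domain_irq().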

generic_handle_irq() takes an irq number; irq_to_desc() finds the struct irq_desc that belongs to it.

It then calls irq_desc->handle_irq to handle the interrupt.

int generic_handle_irq(unsigned int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);

    if (!desc)
        return -EINVAL;
    generic_handle_irq_desc(irq, desc);
    return 0;
}

static inline void generic_handle_irq_desc(unsigned int irq, struct irq_desc *desc)
{
    desc->handle_irq(irq, desc);
}

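Where does desc->handle_irq come from? It is chosen by gic_irq_domain_map() from the hwirq number when each interrupt is set up.

gic_irq_domain_map() picks the flow handler from the hardware interrupt number: numbers below 32 get handle_percpu_devid_irq, all others get handle_fasteoi_irq.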

void
__irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
const char *name)
{
...
desc->handle_irq = handle;
desc->name = name;
...
}

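handle_percpu_devid_irq handles the SGI/PPI interrupts numbered 0~31: it first acknowledges the interrupt through chip->irq_ack if the chip provides one, then runs the handler, and finally sends EOI.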

void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
{
    struct irq_chip *chip = irq_desc_get_chip(desc);
    struct irqaction *action = desc->action;
    void *dev_id = raw_cpu_ptr(action->percpu_dev_id);
    irqreturn_t res;

    kstat_incr_irqs_this_cpu(irq, desc);

    if (chip->irq_ack)
        chip->irq_ack(&desc->irq_data);

    trace_irq_handler_entry(irq, action);
    res = action->handler(irq, dev_id);
    trace_irq_handler_exit(irq, action, res);

    if (chip->irq_eoi)
        chip->irq_eoi(&desc->irq_data);     /* calls gic_eoi_irq() */
}

irq_enter() and irq_exit() explicitly maintain the hardirq field of the count; the code between them runs in interrupt context.

/*
* Enter an interrupt context.
*/
void irq_enter(void)
{
    rcu_irq_enter();
    if (is_idle_task(current) && !in_interrupt()) {
        /*
         * Prevent raise_softirq from needlessly waking up ksoftirqd
         * here, as softirq will be serviced on return from interrupt.
         */
        local_bh_disable();
        tick_irq_enter();
        _local_bh_enable();
    }

    __irq_enter();                              /* explicitly increment the hardirq count */
}

#define __irq_enter()                                   \
    do {                                                \
        account_irq_enter_time(current);                \
        preempt_count_add(HARDIRQ_OFFSET);  /* explicitly increment the hardirq count */ \
        trace_hardirq_enter();                          \
    } while (0)

void irq_exit(void)
{
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
    local_irq_disable();
#else
    WARN_ON_ONCE(!irqs_disabled());
#endif

    account_irq_exit_time(current);
    preempt_count_sub(HARDIRQ_OFFSET);          /* explicitly decrement the hardirq count */
    if (!in_interrupt() && local_softirq_pending())   /* no longer in interrupt context and softirqs are pending: process them */
        invoke_softirq();

    tick_irq_exit();
    rcu_irq_exit();
    trace_hardirq_exit(); /* must be last! */
}

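4.2.1 Interrupt context

Whether the current code runs in interrupt context or in process context is determined from preempt_count, a field of struct thread_info.

preempt_count is 32 bits wide; from low to high its fields are: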

#define PREEMPT_BITS	8
#define SOFTIRQ_BITS 8
#define HARDIRQ_BITS 4
#define NMI_BITS 1
#define hardirq_count() (preempt_count() & HARDIRQ_MASK)    /* hard-interrupt count */
#define softirq_count() (preempt_count() & SOFTIRQ_MASK)    /* softirq count */
/* combined NMI, hard-interrupt and softirq counts */
#define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \
| NMI_MASK))

/*
* Are we doing bottom half or hardware interrupt processing?
*
* in_irq() - We're in (hard) IRQ context
* in_softirq() - We have BH disabled, or are processing softirqs
* in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
* in_serving_softirq() - We're in softirq context
* in_nmi() - We're in NMI context
* in_task() - We're in task context
*
* Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
* should not be used in new code.
*/
#define in_irq() (hardirq_count())              /* in hard-interrupt context? */
#define in_softirq() (softirq_count())          /* processing a softirq, or bottom halves disabled? */
#define in_interrupt() (irq_count())            /* in NMI, hard-interrupt or softirq context (or several at once)? */
#define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET)    /* in softirq context? */
#define in_nmi() (preempt_count() & NMI_MASK)   /* in NMI context? */
/* in process (task) context? */
#define in_task() (!(preempt_count() & \
(NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

Questions to consider: what is the difference between in_softirq() and in_serving_softirq()? Why does in_interrupt() test SOFTIRQ_MASK while in_task() tests SOFTIRQ_OFFSET?
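One way to explore these questions is to rebuild the bit layout in user space. The sketch below derives the masks from the four *_BITS values listed above. It assumes that disabling bottom halves adds 2 * SOFTIRQ_OFFSET to the count while serving a softirq adds SOFTIRQ_OFFSET, which is how the mainline kernel distinguishes the two cases; the sk_* helpers are illustrative re-implementations that take the count as a parameter.

```c
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_BITS 8
#define SOFTIRQ_BITS 8
#define HARDIRQ_BITS 4
#define NMI_BITS     1

#define PREEMPT_SHIFT 0
#define SOFTIRQ_SHIFT (PREEMPT_SHIFT + PREEMPT_BITS)   /* 8  */
#define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS)   /* 16 */
#define NMI_SHIFT     (HARDIRQ_SHIFT + HARDIRQ_BITS)   /* 20 */

#define FIELD_MASK(bits, shift) (((1UL << (bits)) - 1) << (shift))
#define SOFTIRQ_MASK FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK     FIELD_MASK(NMI_BITS, NMI_SHIFT)

#define SOFTIRQ_OFFSET         (1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)    /* assumed: added when bottom halves are disabled */
#define HARDIRQ_OFFSET         (1UL << HARDIRQ_SHIFT)

static_assert(NMI_SHIFT + NMI_BITS <= 32, "all fields must fit in 32 bits");

static bool sk_in_irq(unsigned long pc)             { return (pc & HARDIRQ_MASK) != 0; }
static bool sk_in_softirq(unsigned long pc)         { return (pc & SOFTIRQ_MASK) != 0; }
static bool sk_in_serving_softirq(unsigned long pc) { return (pc & SOFTIRQ_MASK & SOFTIRQ_OFFSET) != 0; }
static bool sk_in_interrupt(unsigned long pc)       { return (pc & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0; }
static bool sk_in_task(unsigned long pc)            { return (pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)) == 0; }
```

Under these assumptions, a task that has only disabled bottom halves reports in_softirq() and in_interrupt() as true but in_serving_softirq() as false and in_task() as true. That gap is the "BH disabled confusion" mentioned in the kernel comment.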

4.3 handle_fasteoi_irq

handle_fasteoi_irq handles SPI interrupts and hands the main work to handle_irq_event().

handle_irq_event_percpu() first runs action->handler and, if required, wakes the interrupt kernel thread, which runs action->thread_fn.

void
handle_fasteoi_irq(unsigned int irq, struct irq_desc *desc)
{
    struct irq_chip *chip = desc->irq_data.chip;

    raw_spin_lock(&desc->lock);

    if (!irq_may_run(desc))
        goto out;

    desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
    kstat_incr_irqs_this_cpu(irq, desc);

    /*
     * If its disabled or no action available
     * then mask it and get out of here:
     */
    /* No action registered, or the interrupt is disabled (IRQD_IRQ_DISABLED):
     * mark it IRQS_PENDING and mask it with mask_irq(). */
    if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {
        desc->istate |= IRQS_PENDING;
        mask_irq(desc);
        goto out;
    }

    if (desc->istate & IRQS_ONESHOT)    /* an IRQS_ONESHOT interrupt must not nest, so mask the source with mask_irq() */
        mask_irq(desc);

    preflow_handler(desc);              /* does something only if preflow_handler() is defined */
    handle_irq_event(desc);

    /* Depending on the conditions, either unmask the interrupt with unmask_irq()
     * or send EOI through irq_chip->irq_eoi to tell the GIC that handling is done. */
    cond_unmask_eoi_irq(desc, chip);

    raw_spin_unlock(&desc->lock);
    return;
out:
    if (!(chip->flags & IRQCHIP_EOI_IF_HANDLED))
        chip->irq_eoi(&desc->irq_data);
    raw_spin_unlock(&desc->lock);
}

handle_irq_event() calls handle_irq_event_percpu(), which runs action->handler() and, if required, wakes the interrupt kernel thread to run action->thread_fn.

irqreturn_t handle_irq_event(struct irq_desc *desc)
{
    struct irqaction *action = desc->action;
    irqreturn_t ret;

    desc->istate &= ~IRQS_PENDING;                      /* clear IRQS_PENDING */
    irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);     /* set IRQD_IRQ_INPROGRESS: the hard interrupt is being handled */
    raw_spin_unlock(&desc->lock);

    ret = handle_irq_event_percpu(desc, action);

    raw_spin_lock(&desc->lock);
    irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);   /* clear IRQD_IRQ_INPROGRESS: handling has finished */
    return ret;
}

irqreturn_t
handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
{
    irqreturn_t retval = IRQ_NONE;
    unsigned int flags = 0, irq = desc->irq_data.irq;

    /* Walk the action list of the descriptor and call the primary handler
     * (action->handler) of every element in turn. */
    do {
        irqreturn_t res;

        trace_irq_handler_entry(irq, action);
        res = action->handler(irq, action->dev_id);     /* run the handler of this struct irqaction */
        trace_irq_handler_exit(irq, action, res);

        if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pF enabled interrupts\n",
                      irq, action->handler))
            local_irq_disable();

        switch (res) {
        case IRQ_WAKE_THREAD:                           /* wake the interrupt kernel thread */
            /*
             * Catch drivers which return WAKE_THREAD but
             * did not set up a thread function
             */
            if (unlikely(!action->thread_fn)) {
                warn_no_thread(irq, action);            /* print a warning that no thread function was set up */
                break;
            }

            __irq_wake_thread(desc, action);            /* wake the kernel thread of this interrupt */

            /* Fall through to add to randomness */
        case IRQ_HANDLED:                               /* fully handled, nothing more to do */
            flags |= action->flags;
            break;

        default:
            break;
        }

        retval |= res;
        action = action->next;
    } while (action);

    add_interrupt_randomness(irq, flags);

    if (!noirqdebug)
        note_interrupt(irq, desc, retval);
    return retval;
}

4.3.1 Waking the interrupt kernel thread

__irq_wake_thread() wakes the kernel thread that belongs to the interrupt.

void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action)
{
    /*
     * In case the thread crashed and was killed we just pretend that
     * we handled the interrupt. The hardirq handler has disabled the
     * device interrupt, so no irq storm is lurking.
     */
    if (action->thread->flags & PF_EXITING)
        return;

    /*
     * Wake up the handler thread for this action. If the
     * RUNTHREAD bit is already set, nothing to do.
     */
    if (test_and_set_bit(IRQTF_RUNTHREAD, &action->thread_flags))   /* IRQTF_RUNTHREAD already set: a wake-up is pending, return */
        return;

    /* For shared interrupts every action owns one bit in thread_mask. Each bit of
     * threads_oneshot marks a thread of a shared oneshot interrupt that is still running. */
    desc->threads_oneshot |= action->thread_mask;

    atomic_inc(&desc->threads_active);      /* count of active interrupt threads */

    wake_up_process(action->thread);        /* wake the kernel thread of this action */
}

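4.3.2 Creating the interrupt kernel thread

When an interrupt is registered and the conditions are met, a kernel thread named irq/xx-xx is created with irq_thread as its thread function. Its priority is 49 (99 - 50) and its scheduling policy is SCHED_FIFO.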

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
    ...
    /*
     * Create a handler thread when a thread function is supplied
     * and the interrupt does not nest into another interrupt
     * thread.
     */
    if (new->thread_fn && !nested) {
        struct task_struct *t;
        static const struct sched_param param = {
            /* Priority of the irq kernel thread; the prio shown in
             * /proc/xxx/sched is MAX_RT_PRIO-1-sched_priority. */
            .sched_priority = MAX_USER_RT_PRIO/2,
        };

        t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
                           new->name);          /* create a kernel thread named irq/xxx-xxx that runs irq_thread */
        ...
        sched_setscheduler_nocheck(t, SCHED_FIFO, &param);   /* set the scheduling policy to SCHED_FIFO */

        /*
         * We keep the reference to the task struct even if
         * the thread dies to avoid that the interrupt code
         * references an already freed task_struct.
         */
        get_task_struct(t);
        new->thread = t;                        /* associate the thread with the irqaction */

        set_bit(IRQTF_AFFINITY, &new->thread_flags);   /* request that CPU affinity be applied to the interrupt thread */
    }
    ...
}
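The relation between the sched_priority passed to the scheduler and the prio value of 49 quoted above is a one-line formula. The sketch below evaluates it in user space, assuming MAX_USER_RT_PRIO and MAX_RT_PRIO are both 100 as in mainline kernels; rt_kernel_prio is an illustrative name.

```c
#include <assert.h>

#define MAX_USER_RT_PRIO 100
#define MAX_RT_PRIO      MAX_USER_RT_PRIO

/* Kernel-internal priority of a real-time task: a lower value means a higher priority. */
static int rt_kernel_prio(int sched_priority)
{
    assert(sched_priority >= 1 && sched_priority < MAX_RT_PRIO);
    return MAX_RT_PRIO - 1 - sched_priority;
}
```

With sched_priority = MAX_USER_RT_PRIO/2 = 50 the result is 99 - 50 = 49.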

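4.3.3 Running the interrupt kernel thread

irq_thread is the thread function of the interrupt thread; it waits inside irq_wait_for_interrupt().

irq_wait_for_interrupt() tests the IRQTF_RUNTHREAD flag. If the flag is not set, it calls schedule() to give up the CPU and sleep.

Once __irq_wake_thread() has set IRQTF_RUNTHREAD and called wake_up_process(), irq_wait_for_interrupt() returns 0.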

static int irq_thread(void *data)
{
    struct callback_head on_exit_work;
    struct irqaction *action = data;
    struct irq_desc *desc = irq_to_desc(action->irq);
    irqreturn_t (*handler_fn)(struct irq_desc *desc,
                              struct irqaction *action);

    if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
                                     &action->thread_flags))
        handler_fn = irq_forced_thread_fn;
    else
        handler_fn = irq_thread_fn;

    init_task_work(&on_exit_work, irq_thread_dtor);
    task_work_add(current, &on_exit_work, false);

    irq_thread_check_affinity(desc, action);

    while (!irq_wait_for_interrupt(action)) {
        irqreturn_t action_ret;

        irq_thread_check_affinity(desc, action);

        action_ret = handler_fn(desc, action);      /* run the threaded handler */
        if (action_ret == IRQ_HANDLED)
            atomic_inc(&desc->threads_handled);     /* increment the threads_handled count */

        wake_threads_waitq(desc);                   /* wake the wait_for_threads wait queue */
    }

    /*
     * This is the regular exit path. __free_irq() is stopping the
     * thread via kthread_stop() after calling
     * synchronize_irq(). So neither IRQTF_RUNTHREAD nor the
     * oneshot mask bit can be set. We cannot verify that as we
     * cannot touch the oneshot mask at this point anymore as
     * __setup_irq() might have given out currents thread_mask
     * again.
     */
    task_work_cancel(current, irq_thread_dtor);
    return 0;
}

static int irq_wait_for_interrupt(struct irqaction *action)
{
    set_current_state(TASK_INTERRUPTIBLE);

    while (!kthread_should_stop()) {

        /* If IRQTF_RUNTHREAD is set in thread_flags, clear it, switch to TASK_RUNNING
         * and return 0. This pairs with the flag being set in __irq_wake_thread(). */
        if (test_and_clear_bit(IRQTF_RUNTHREAD,
                               &action->thread_flags)) {
            __set_current_state(TASK_RUNNING);
            return 0;
        }
        schedule();                                 /* give up the CPU and sleep here */
        set_current_state(TASK_INTERRUPTIBLE);
    }
    __set_current_state(TASK_RUNNING);
    return -1;
}

static irqreturn_t irq_thread_fn(struct irq_desc *desc,
                                 struct irqaction *action)
{
    irqreturn_t ret;

    ret = action->thread_fn(action->irq, action->dev_id);   /* run the thread_fn that was passed to request_threaded_irq() */
    irq_finalize_oneshot(desc, action);                     /* final cleanup for oneshot interrupts, mainly unmasking the interrupt */
    return ret;
}
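The handshake between __irq_wake_thread() and irq_wait_for_interrupt() rests on one flag bit that is set with test-and-set and consumed with test-and-clear. A single-threaded sketch with C11 atomics shows why repeated wake-ups collapse into one run of the thread. struct mini_action and both functions are invented for the example, and the bit position is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define IRQTF_RUNTHREAD_BIT (1u << 0)   /* bit position assumed for this sketch */

struct mini_action {
    atomic_uint thread_flags;
    unsigned int wakeups;     /* how many wake-ups were actually issued */
};

/* Hard-interrupt side: returns true if a wake-up was issued. */
static bool mini_wake_thread(struct mini_action *a)
{
    assert(a);
    unsigned int old = atomic_fetch_or(&a->thread_flags, IRQTF_RUNTHREAD_BIT);
    if (old & IRQTF_RUNTHREAD_BIT)
        return false;         /* already pending: nothing to do */
    a->wakeups++;             /* stands in for wake_up_process() */
    return true;
}

/* Thread side: returns true if there is work; the flag is cleared in the same step. */
static bool mini_thread_should_run(struct mini_action *a)
{
    assert(a);
    unsigned int old = atomic_fetch_and(&a->thread_flags, ~IRQTF_RUNTHREAD_BIT);
    return (old & IRQTF_RUNTHREAD_BIT) != 0;
}
```

Two interrupts arriving before the thread runs lead to a single wake-up and a single pass through the thread's loop, which mirrors the early return in __irq_wake_thread().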

irq_finalize_oneshot() performs the final cleanup for oneshot interrupts.

static void irq_finalize_oneshot(struct irq_desc *desc,
                                 struct irqaction *action)
{
    if (!(desc->istate & IRQS_ONESHOT) ||
        action->handler == irq_forced_secondary_handler)
        return;
again:
    chip_bus_lock(desc);
    raw_spin_lock_irq(&desc->lock);

    /*
     * Implausible though it may be we need to protect us against
     * the following scenario:
     *
     * The thread is faster done than the hard interrupt handler
     * on the other CPU. If we unmask the irq line then the
     * interrupt can come in again and masks the line, leaves due
     * to IRQS_INPROGRESS and the irq line is masked forever.
     *
     * This also serializes the state of shared oneshot handlers
     * versus "desc->threads_onehsot |= action->thread_mask;" in
     * irq_wake_thread(). See the comment there which explains the
     * serialization.
     */
    /* Wait until the hard-interrupt handler has cleared IRQD_IRQ_INPROGRESS, see
     * handle_irq_event(). The flag stays set for as long as the hard interrupt is
     * being handled and is cleared only when that handling has finished. */
    if (unlikely(irqd_irq_inprogress(&desc->irq_data))) {
        raw_spin_unlock_irq(&desc->lock);
        chip_bus_sync_unlock(desc);
        cpu_relax();
        goto again;
    }

    /*
     * Now check again, whether the thread should run. Otherwise
     * we would clear the threads_oneshot bit of this thread which
     * was just set.
     */
    if (test_bit(IRQTF_RUNTHREAD, &action->thread_flags))
        goto out_unlock;

    desc->threads_oneshot &= ~action->thread_mask;

    if (!desc->threads_oneshot && !irqd_irq_disabled(&desc->irq_data) &&
        irqd_irq_masked(&desc->irq_data))
        unmask_threaded_irq(desc);      /* send EOI or unmask the interrupt */

out_unlock:
    raw_spin_unlock_irq(&desc->lock);
    chip_bus_sync_unlock(desc);
}
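The core of the oneshot bookkeeping is a bitmask with one bit per threaded handler: the line is unmasked only when the last bit is cleared. The sketch below isolates that rule and leaves out the locking and the IRQD_IRQ_INPROGRESS retry; struct mini_desc and mini_finalize_oneshot are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>

struct mini_desc {
    unsigned long threads_oneshot;   /* one bit per threaded handler still running */
    bool masked;
    bool disabled;
};

/* Called when the thread that owns thread_mask has finished.
 * Returns true if this call unmasked the line. */
static bool mini_finalize_oneshot(struct mini_desc *d, unsigned long thread_mask)
{
    assert(d);
    d->threads_oneshot &= ~thread_mask;
    if (!d->threads_oneshot && !d->disabled && d->masked) {
        d->masked = false;           /* stands in for unmask_threaded_irq() */
        return true;
    }
    return false;
}
```

With two handlers sharing a line, the first thread to finish leaves the line masked and the second one unmasks it.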

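This completes the handling of one interrupt.

4.4 How is an IRQS_ONESHOT interrupt kept from nesting?

5. Registering an interrupt

5.1 Interrupts, threads and threaded interrupts

Interrupt handling consists of the top half (the hard-interrupt handler) and the bottom-half mechanisms: softirq, tasklet, workqueue and threaded interrupts.

When a peripheral raises an interrupt, the kernel runs a function in response. This function is usually called the interrupt handler or interrupt service routine.

The top half runs in interrupt context and is expected to finish quickly and leave the interrupt.

Threaded interrupts are a feature that came out of the real-time Linux project. The goal is to reduce the impact of interrupt handling on real-time latency.

In the Linux kernel, interrupts have the highest priority. Whenever an interrupt arrives, the kernel suspends what it is doing and handles the interrupt. Process scheduling happens only after all pending interrupts and softirqs have been processed, so real-time tasks may not be served in time.

Interrupt context always preempts process context. Interrupt context covers not only interrupt handlers but also softirqs, tasklets and so on, which makes it one of the biggest challenges when optimising the real-time behaviour of Linux.

5.2 Interrupt registration interfaces

The IRQF_* flags describe the properties of an interrupt when it is requested with request_threaded_irq().

The IRQS_* flags live in the istate member of struct irq_desc, also known as core_internal_state__do_not_mess_with_it.

The IRQD_* flags are kept in the state_use_accessors member of struct irq_data and usually describe low-level interrupt state.

A note on IRQF_ONESHOT: the interrupt may only be re-enabled after the hard-interrupt handler has finished, and during threaded handling the interrupt line stays disabled until every thread_fn on that line has completed.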

#define IRQF_TRIGGER_NONE       0x00000000
#define IRQF_TRIGGER_RISING     0x00000001      /* rising-edge triggered */
#define IRQF_TRIGGER_FALLING    0x00000002      /* falling-edge triggered */
#define IRQF_TRIGGER_HIGH       0x00000004      /* high-level triggered */
#define IRQF_TRIGGER_LOW        0x00000008      /* low-level triggered */
/* the four trigger types */
#define IRQF_TRIGGER_MASK (IRQF_TRIGGER_HIGH | IRQF_TRIGGER_LOW | \
                           IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING)
#define IRQF_TRIGGER_PROBE      0x00000010

#define IRQF_SHARED             0x00000080      /* several devices share one interrupt number */
#define IRQF_PROBE_SHARED       0x00000100      /* the handler tolerates sharing mismatches */
#define __IRQF_TIMER            0x00000200      /* marks a timer interrupt */
#define IRQF_PERCPU             0x00000400      /* interrupt belongs to one specific CPU */
#define IRQF_NOBALANCING        0x00000800      /* exclude from interrupt balancing across CPUs */
#define IRQF_IRQPOLL            0x00001000      /* interrupt is used for polling */
#define IRQF_ONESHOT            0x00002000      /* one-shot interrupt, nesting not allowed */
#define IRQF_NO_SUSPEND         0x00004000      /* do not disable this interrupt during system suspend */
#define IRQF_FORCE_RESUME       0x00008000      /* force-enable this interrupt on system resume */
#define IRQF_NO_THREAD          0x00010000      /* this interrupt must not be threaded */
#define IRQF_EARLY_RESUME       0x00020000
#define IRQF_COND_SUSPEND       0x00040000

#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)

enum {
    IRQS_AUTODETECT         = 0x00000001,       /* autodetection in progress */
    IRQS_SPURIOUS_DISABLED  = 0x00000002,       /* treated as spurious and disabled */
    IRQS_POLL_INPROGRESS    = 0x00000008,       /* polling call of the action in progress */
    IRQS_ONESHOT            = 0x00000020,       /* runs only once; derived from IRQF_ONESHOT and needs care once the threaded handler has finished, see irq_finalize_oneshot() */
    IRQS_REPLAY             = 0x00000040,       /* interrupt is being resent */
    IRQS_WAITING            = 0x00000080,       /* waiting */
    IRQS_PENDING            = 0x00000200,       /* interrupt is pending */
    IRQS_SUSPENDED          = 0x00000800,       /* interrupt is suspended */
};

enum {
    IRQD_TRIGGER_MASK           = 0xf,          /* trigger type of the interrupt */
    IRQD_SETAFFINITY_PENDING    = (1 <<  8),
    IRQD_NO_BALANCING           = (1 << 10),
    IRQD_PER_CPU                = (1 << 11),
    IRQD_AFFINITY_SET           = (1 << 12),
    IRQD_LEVEL                  = (1 << 13),
    IRQD_WAKEUP_STATE           = (1 << 14),
    IRQD_MOVE_PCNTXT            = (1 << 15),
    IRQD_IRQ_DISABLED           = (1 << 16),    /* interrupt is disabled */
    IRQD_IRQ_MASKED             = (1 << 17),    /* interrupt is masked */
    IRQD_IRQ_INPROGRESS         = (1 << 18),    /* interrupt is being handled */
    IRQD_WAKEUP_ARMED           = (1 << 19),
    IRQD_FORWARDED_TO_VCPU      = (1 << 20),
};

struct irqaction is the action descriptor of each registered interrupt handler.

struct irqaction {
    irq_handler_t handler;              /* pointer to the primary handler */
    void *dev_id;                       /* argument passed to the interrupt handler */
    void __percpu *percpu_dev_id;
    struct irqaction *next;
    irq_handler_t thread_fn;            /* pointer to the threaded handler */
    struct task_struct *thread;         /* task_struct of the interrupt thread */
    unsigned int irq;                   /* Linux interrupt number */
    unsigned int flags;                 /* IRQF_* flags given at registration */
    unsigned long thread_flags;         /* flags related to the interrupt thread */
    unsigned long thread_mask;          /* for shared interrupts, each action owns one bit */
    const char *name;                   /* name used for the interrupt and its thread */
    struct proc_dir_entry *dir;
} ____cacheline_internodealigned_in_smp;

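request_irq() registers an interrupt by calling request_threaded_irq(); it merely lacks the thread_fn parameter. That is also the difference between the two: request_irq() cannot register a threaded interrupt.

irq: the Linux interrupt number, not the hardware interrupt number.

handler: the primary handler, i.e. the interrupt handler passed to request_irq().

thread_fn: the handler that runs in the interrupt thread.

irqflags: interrupt flags, see the IRQF_* descriptions.

devname: name of the interrupt.

dev_id: argument passed to the interrupt handler.

handler and thread_fn are assigned to action->handler and action->thread_fn respectively. The possible combinations are: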

    handler   thread_fn   Result
1   √         √           handler runs first, then thread_fn runs conditionally
2   √         ×           same as request_irq()
3   ×         √           handler = irq_default_primary_handler
4   ×         ×           returns -EINVAL

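Many callers of request_threaded_irq() use combination 3: irq_default_primary_handler() returns IRQ_WAKE_THREAD and leaves the work to thread_fn.

Combination 2 is equivalent to request_irq().

Combination 4 is not allowed, because the interrupt would never be handled.

Combination 1 is the most involved: depending on the situation, handler returns either IRQ_WAKE_THREAD (wake the interrupt kernel thread) or IRQ_HANDLED (the interrupt is fully handled and the thread need not be woken).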
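The four combinations reduce to one small piece of logic. The user-space sketch below reproduces that decision; all mini_* names are hypothetical, and the return values 1 and 2 used for "handled" and "wake thread" are assumptions that only need to be distinct here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*mini_handler_t)(int irq, void *dev_id);

#define MINI_IRQ_HANDLED     1   /* values assumed for this sketch */
#define MINI_IRQ_WAKE_THREAD 2

/* Stand-in for irq_default_primary_handler(): always asks for the thread. */
static int mini_default_primary_handler(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return MINI_IRQ_WAKE_THREAD;
}

/* Hypothetical driver callbacks used to exercise the combinations. */
static int mini_driver_handler(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return MINI_IRQ_HANDLED;
}

static int mini_driver_thread(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return MINI_IRQ_HANDLED;
}

/* Resolves the handler pair; returns 0 on success or -EINVAL when both are NULL. */
static int mini_resolve_handlers(mini_handler_t *handler, mini_handler_t thread_fn)
{
    assert(handler);
    if (!*handler) {
        if (!thread_fn)
            return -EINVAL;                       /* combination 4 */
        *handler = mini_default_primary_handler;  /* combination 3 */
    }
    return 0;                                     /* combinations 1 and 2 keep the caller's handler */
}
```

In combination 3 the substituted primary handler does nothing except request the thread, so all real work happens in thread_fn.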

After checking its arguments, request_threaded_irq() allocates and fills a struct irqaction and then hands the registration to __setup_irq().

static inline int __must_check
request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags,
            const char *name, void *dev)
{
    return request_threaded_irq(irq, handler, NULL, flags, name, dev);
}

int request_threaded_irq(unsigned int irq, irq_handler_t handler,
                         irq_handler_t thread_fn, unsigned long irqflags,
                         const char *devname, void *dev_id)
{
    ...
    /* A shared interrupt must pass dev_id so that the sharing devices can be told apart. */
    if (((irqflags & IRQF_SHARED) && !dev_id) ||
        (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
        ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
        return -EINVAL;

    desc = irq_to_desc(irq);                    /* find the struct irq_desc for the Linux interrupt number */
    if (!desc)
        return -EINVAL;
    ...
    if (!handler) {
        if (!thread_fn)
            return -EINVAL;                     /* handler and thread_fn must not both be NULL */
        handler = irq_default_primary_handler;  /* no handler given: irq_default_primary_handler() returns IRQ_WAKE_THREAD */
    }

    action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);   /* allocate the struct irqaction and fill in its members */
    if (!action)
        return -ENOMEM;

    action->handler = handler;
    action->thread_fn = thread_fn;
    action->flags = irqflags;
    action->name = devname;
    action->dev_id = dev_id;

    chip_bus_lock(desc);                        /* lock through desc->irq_data.chip->irq_bus_lock() */
    retval = __setup_irq(irq, desc, action);
    chip_bus_sync_unlock(desc);

    if (retval)
        kfree(action);
    ...
    return retval;
}

5.3 __setup_irq

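__setup_irq() first checks its arguments and then creates the interrupt kernel thread if needed, dealing with interrupt nesting, oneshot and interrupt sharing along the way.

It also sets the trigger type and enables the interrupt. Finally it wakes the interrupt kernel thread if there is one and creates the /proc entries for this interrupt.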

/*
* Internal function to register an irqaction - typically used to
* allocate special interrupts that are part of the architecture.
*/
static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
    struct irqaction *old, **old_ptr;
    unsigned long flags, thread_mask = 0;
    int ret, nested, shared = 0;
    cpumask_var_t mask;

    if (!desc)
        return -EINVAL;

    /* The interrupt controller was not initialised properly; for GICv2,
     * gic_irq_domain_alloc() sets the chip to gic_chip. */
    if (desc->irq_data.chip == &no_irq_chip)
        return -ENOSYS;
    if (!try_module_get(desc->owner))
        return -ENODEV;

    /*
     * Check whether the interrupt nests into another interrupt
     * thread.
     */
    nested = irq_settings_is_nested_thread(desc);   /* a descriptor flagged _IRQ_NESTED_THREAD must supply thread_fn */
    if (nested) {
        if (!new->thread_fn) {
            ret = -EINVAL;
            goto out_mput;
        }
        /*
         * Replace the primary handler which was provided from
         * the driver for non nested interrupt handling by the
         * dummy function which warns when called.
         */
        new->handler = irq_nested_primary_handler;
    } else {
        if (irq_settings_can_thread(desc))          /* without _IRQ_NOTHREAD the interrupt may be force-threaded */
            irq_setup_forced_threading(new);
    }

    /*
     * Create a handler thread when a thread function is supplied
     * and the interrupt does not nest into another interrupt
     * thread.
     */
    /* For a non-nested threaded interrupt, create a real-time kernel thread
     * with policy SCHED_FIFO and sched_priority 50. */
    if (new->thread_fn && !nested) {
        struct task_struct *t;
        static const struct sched_param param = {
            .sched_priority = MAX_USER_RT_PRIO/2,
        };

        t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
                           new->name);              /* thread name is built from "irq", the interrupt number and the interrupt name; it runs irq_thread() */
        if (IS_ERR(t)) {
            ret = PTR_ERR(t);
            goto out_mput;
        }

        sched_setscheduler_nocheck(t, SCHED_FIFO, &param);

        get_task_struct(t);
        new->thread = t;

        set_bit(IRQTF_AFFINITY, &new->thread_flags);
    }

    if (!alloc_cpumask_var(&mask, GFP_KERNEL)) {
        ret = -ENOMEM;
        goto out_thread;
    }

    /*
     * Drivers are often written to work w/o knowledge about the
     * underlying irq chip implementation, so a request for a
     * threaded irq without a primary hard irq context handler
     * requires the ONESHOT flag to be set. Some irq chips like
     * MSI based interrupts are per se one shot safe. Check the
     * chip flags, so we can avoid the unmask dance at the end of
     * the threaded handler for those.
     */
    if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)   /* the chip is inherently oneshot safe, so IRQF_ONESHOT is dropped from flags */
        new->flags &= ~IRQF_ONESHOT;

    raw_spin_lock_irqsave(&desc->lock, flags);
    old_ptr = &desc->action;
    old = *old_ptr;
    /* old is the list that desc->action points to. A non-empty list means an action
     * is already installed on this descriptor, so this is a shared interrupt: shared = 1. */
    if (old) {
        ...
        /* add new interrupt at end of irq queue */
        do {
            /*
             * Or all existing action->thread_mask bits,
             * so we can find the next zero bit for this
             * new action.
             */
            thread_mask |= old->thread_mask;
            old_ptr = &old->next;
            old = *old_ptr;
        } while (old);
        shared = 1;
    }

    /*
     * Setup the thread mask for this irqaction for ONESHOT. For
     * !ONESHOT irqs the thread mask is 0 so we can avoid a
     * conditional in irq_wake_thread().
     */
    if (new->flags & IRQF_ONESHOT) {
        /*
         * Unlikely to have 32 resp 64 irqs sharing one line,
         * but who knows.
         */
        if (thread_mask == ~0UL) {
            ret = -EBUSY;
            goto out_mask;
        }
        new->thread_mask = 1 << ffz(thread_mask);

    /* Not IRQF_ONESHOT and the handler is the default irq_default_primary_handler():
     * a level-triggered interrupt that is not cleared after firing easily causes an
     * interrupt storm. Without a primary handler and without hardware oneshot support
     * in the controller, the driver must set IRQF_ONESHOT explicitly. */
    } else if (new->handler == irq_default_primary_handler &&
               !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {

        pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n",
               irq);
        ret = -EINVAL;
        goto out_mask;
    }

    if (!shared) {                                  /* the interrupt is not shared */
        ret = irq_request_resources(desc);
        if (ret) {
            pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
                   new->name, irq, desc->irq_data.chip->name);
            goto out_mask;
        }

        init_waitqueue_head(&desc->wait_for_threads);

        /* Setup the type (level, edge polarity) if configured: */
        if (new->flags & IRQF_TRIGGER_MASK) {
            ret = __irq_set_trigger(desc, irq,      /* calls gic_chip->irq_set_type to set the trigger type */
                                    new->flags & IRQF_TRIGGER_MASK);

            if (ret)
                goto out_mask;
        }

        desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
                          IRQS_ONESHOT | IRQS_WAITING);
        irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);   /* clear IRQD_IRQ_INPROGRESS */

        if (new->flags & IRQF_PERCPU) {
            irqd_set(&desc->irq_data, IRQD_PER_CPU);
            irq_settings_set_per_cpu(desc);
        }

        if (new->flags & IRQF_ONESHOT)
            desc->istate |= IRQS_ONESHOT;

        if (irq_settings_can_autoenable(desc))
            irq_startup(desc, true);
        else
            /* Undo nested disables: */
            desc->depth = 1;

        /* Exclude IRQ from balancing if requested */
        if (new->flags & IRQF_NOBALANCING) {
            irq_settings_set_no_balancing(desc);
            irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
        }

        /* Set default affinity mask once everything is setup */
        setup_affinity(irq, desc, mask);

    } else if (new->flags & IRQF_TRIGGER_MASK) {
        ...
    }

    new->irq = irq;
    *old_ptr = new;

    irq_pm_install_action(desc, new);

    /* Reset broken irq detection when installing new handler */
    desc->irq_count = 0;
    desc->irqs_unhandled = 0;

    /*
     * Check whether we disabled the irq via the spurious handler
     * before. Reenable it and give it another chance.
     */
    if (shared && (desc->istate & IRQS_SPURIOUS_DISABLED)) {
        desc->istate &= ~IRQS_SPURIOUS_DISABLED;
        __enable_irq(desc, irq);
    }

    raw_spin_unlock_irqrestore(&desc->lock, flags);

    /*
     * Strictly no need to wake it up, but hung_task complains
     * when no hard interrupt wakes the thread up.
     */
    if (new->thread)
        wake_up_process(new->thread);               /* if the interrupt is threaded, wake its kernel thread; each action has its own thread */

    register_irq_proc(irq, desc);                   /* create the /proc/irq/xxx/ directory and its entries */
    new->dir = NULL;
    register_handler_proc(irq, new);                /* create a directory named after action->name */
    free_cpumask_var(mask);

    return 0;
    ...
}
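The thread_mask bookkeeping in __setup_irq() amounts to "OR all masks already in use, then take the first zero bit". The sketch below reproduces this in user space. mini_ffz is a simple loop written for the example rather than the kernel's ffz(), and the sketch signals a full mask by returning 0 where the kernel returns -EBUSY.

```c
#include <assert.h>

/* Find the first zero bit; the caller guarantees word != ~0UL. */
static unsigned int mini_ffz(unsigned long word)
{
    unsigned int bit = 0;

    assert(word != ~0UL);
    while (word & 1UL) {
        word >>= 1;
        bit++;
    }
    return bit;
}

/* Returns the thread_mask for a new action, or 0 if every bit is taken. */
static unsigned long mini_alloc_thread_mask(const unsigned long *existing, unsigned int n)
{
    unsigned long used = 0;

    for (unsigned int i = 0; i < n; i++)
        used |= existing[i];        /* thread_mask |= old->thread_mask */
    if (used == ~0UL)
        return 0;
    return 1UL << mini_ffz(used);   /* next free bit for the new action */
}
```

Bits released by freed actions are reused, because the scan always starts from bit 0 of the combined mask.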

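irq_setup_forced_threading() decides whether the interrupt is to be force-threaded. If so, it sets IRQTF_FORCED_THREAD in thread_flags to record that this interrupt has been force-threaded.

The original primary handler is moved into the interrupt thread, and the primary handler is replaced by irq_default_primary_handler.

If the action already had a thread_fn, a secondary action is set up whose primary handler is irq_forced_secondary_handler(), and the original thread_fn is moved to the secondary action's thread.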

static int irq_setup_forced_threading(struct irqaction *new)
{
	/*
	 * Forced threading is enabled when the kernel command line contains
	 * threadirqs, or when the CONFIG_PREEMPT_RT_BASE real-time patch
	 * option is turned on.
	 */
	if (!force_irqthreads)
		return 0;
	/* Flags that conflict with forced threading. */
	if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
		return 0;

	/* Every force-threaded interrupt gets IRQF_ONESHOT. */
	new->flags |= IRQF_ONESHOT;

	if (new->handler != irq_default_primary_handler && new->thread_fn) {
		/* Allocate the secondary action */
		new->secondary = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
		if (!new->secondary)
			return -ENOMEM;
		new->secondary->handler = irq_forced_secondary_handler;
		new->secondary->thread_fn = new->thread_fn;
		new->secondary->dev_id = new->dev_id;
		new->secondary->irq = new->irq;
		new->secondary->name = new->name;
	}
	/* Deal with the primary handler */
	set_bit(IRQTF_FORCED_THREAD, &new->thread_flags);
	new->thread_fn = new->handler;
	new->handler = irq_default_primary_handler;
	return 0;
}
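
The handler swap is easier to follow in isolation. The following is a minimal user-space sketch, not kernel code: the names model_action, model_force_threading(), the MODEL_* constants and the demo_* handlers are all hypothetical stand-ins invented for illustration, and only the fields needed to show the swap are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
typedef int (*model_handler_t)(int irq, void *dev_id);

enum { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED, MODEL_IRQ_WAKE_THREAD };
enum {
    MODEL_F_ONESHOT   = 1 << 0,
    MODEL_F_NO_THREAD = 1 << 1,
    MODEL_F_PERCPU    = 1 << 2
};

struct model_action {
    model_handler_t handler;        /* runs in hard-IRQ context */
    model_handler_t thread_fn;      /* runs in the IRQ thread */
    struct model_action *secondary; /* only set when both handlers existed */
    unsigned int flags;
    int forced_thread;              /* models the IRQTF_FORCED_THREAD bit */
};

/* Models irq_default_primary_handler(): do nothing but ask for the thread. */
int model_default_primary(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return MODEL_IRQ_WAKE_THREAD;
}

/* Models irq_forced_secondary_handler(): never expected to be called. */
int model_forced_secondary(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return MODEL_IRQ_NONE;
}

/* Example driver handlers used to exercise the model. */
int demo_top_half(int irq, void *dev_id)  { (void)irq; (void)dev_id; return MODEL_IRQ_HANDLED; }
int demo_thread_fn(int irq, void *dev_id) { (void)irq; (void)dev_id; return MODEL_IRQ_HANDLED; }

/* Mirrors the control flow of irq_setup_forced_threading() on the model. */
int model_force_threading(struct model_action *a, int force_irqthreads)
{
    assert(a != NULL);
    if (!force_irqthreads)
        return 0;
    if (a->flags & (MODEL_F_NO_THREAD | MODEL_F_PERCPU | MODEL_F_ONESHOT))
        return 0;

    a->flags |= MODEL_F_ONESHOT;

    if (a->handler != model_default_primary && a->thread_fn) {
        /* Both handlers exist: the old thread_fn moves to a secondary action. */
        a->secondary = calloc(1, sizeof(*a->secondary));
        if (!a->secondary)
            return -1;
        a->secondary->handler = model_forced_secondary;
        a->secondary->thread_fn = a->thread_fn;
    }
    a->forced_thread = 1;
    a->thread_fn = a->handler;          /* old top half now runs in a thread */
    a->handler = model_default_primary; /* hard-IRQ part only wakes the thread */
    return 0;
}
```

Calling model_force_threading() on an action that carries both demo_top_half and demo_thread_fn leaves model_default_primary as the hard-IRQ handler, demo_top_half as thread_fn, and demo_thread_fn on the secondary action. An action flagged MODEL_F_NO_THREAD is left untouched.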

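setup_irq(), request_threaded_irq() and request_irq() are all wrappers around __setup_irq().

request_irq() simply calls request_threaded_irq() without a thread_fn (it passes NULL).

request_threaded_irq() and setup_irq() differ in how the struct irqaction is obtained: setup_irq() takes a struct irqaction supplied by the caller, whereas request_threaded_irq() builds the struct irqaction internally.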
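
The relationship between the three entry points can be sketched as a small user-space model. This is an illustration only and makes several assumptions: every model_* name, the fixed-size model_table and the -1 error value are hypothetical, and the real functions take more parameters and do far more checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the kernel definitions. */
typedef int (*model_handler_t)(int irq, void *dev_id);

struct model_irqaction {
    model_handler_t handler;
    model_handler_t thread_fn;
    unsigned long flags;
    const char *name;
    void *dev_id;
};

#define MODEL_NR_IRQS 8
static struct model_irqaction *model_table[MODEL_NR_IRQS];

/* Common back end, standing in for __setup_irq(): installs a ready action. */
static int model_setup_common(unsigned int irq, struct model_irqaction *act)
{
    if (irq >= MODEL_NR_IRQS || model_table[irq] != NULL)
        return -1;
    model_table[irq] = act;
    return 0;
}

/* Like setup_irq(): the caller supplies the irqaction. */
int model_setup_irq(unsigned int irq, struct model_irqaction *act)
{
    assert(act != NULL);
    return model_setup_common(irq, act);
}

/* Like request_threaded_irq(): the irqaction is assembled internally. */
int model_request_threaded_irq(unsigned int irq, model_handler_t handler,
                               model_handler_t thread_fn, unsigned long flags,
                               const char *name, void *dev_id)
{
    struct model_irqaction *act;
    int ret;

    if (!handler && !thread_fn)
        return -1;
    act = calloc(1, sizeof(*act));
    if (!act)
        return -1;
    act->handler = handler;
    act->thread_fn = thread_fn;
    act->flags = flags;
    act->name = name;
    act->dev_id = dev_id;

    ret = model_setup_common(irq, act);
    if (ret)
        free(act);  /* installation failed, release the action again */
    return ret;
}

/* Like request_irq(): the same call with thread_fn left out. */
int model_request_irq(unsigned int irq, model_handler_t handler,
                      unsigned long flags, const char *name, void *dev_id)
{
    return model_request_threaded_irq(irq, handler, NULL, flags, name, dev_id);
}

/* Lookup helper for inspecting what was installed. */
struct model_irqaction *model_lookup(unsigned int irq)
{
    return irq < MODEL_NR_IRQS ? model_table[irq] : NULL;
}

/* Example handler used to exercise the model. */
int demo_handler(int irq, void *dev_id) { (void)irq; (void)dev_id; return 1; }
```

In this model, model_request_irq(3, demo_handler, 0, "demo", NULL) installs an action whose thread_fn is NULL, while model_setup_irq() installs exactly the structure the caller passed in.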

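6. The life of an interrupt

The analysis above shows how an interrupt is raised, handled, and finally completed. Here we use a tree-shaped code path to summarize the life cycle of one interrupt.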

vector_irq()->__irq_svc()
    ->svc_entry()----Save the interrupted context.
    ->irq_handler()->gic_handle_irq()----For the GIC interrupt controller the handler is gic_handle_irq(); this is where architecture code hands over to GIC code.
        ->GIC_CPU_INTACK----Read the IAR register to acknowledge the interrupt.
        ->handle_domain_irq()
            ->irq_enter()----Enter hard-IRQ context.
            ->generic_handle_irq()
                ->generic_handle_irq_desc()->handle_fasteoi_irq()----The flow handler depends on the interrupt type; this path is for hardware interrupt numbers >= 32.
                    ->handle_irq_event()->handle_irq_event_percpu()
                        ->action->handler()----The handler registered for this particular interrupt, i.e. the top half.
                        ->__irq_wake_thread()----If the handler returns IRQ_WAKE_THREAD, wake the interrupt thread; the thread does not run immediately.
                    ->GIC_CPU_EOI----Write the EOI register to signal completion. handle_fasteoi_irq() does this through chip->irq_eoi once the handlers have run; until then the GIC does not signal another interrupt of equal or lower priority to this CPU.
            ->irq_exit()----Leave hard-IRQ context and process softirqs if appropriate.
                ->invoke_softirq()----Process pending softirqs; once certain limits are exceeded, the remaining work is handed to the softirq thread.
    ->svc_exit----Restore the interrupted context.

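From the analysis above we can see that:

  • The top half runs with hardware interrupts off. Here "off" means that the GIC does not hand further interrupts to this CPU for handling: only after EOI is written does the GIC Distributor select another interrupt to deliver.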
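  • Softirqs run in softirq context. __do_softirq() re-enables local hardware interrupts while the softirq handlers run, but softirq context is still atomic: a softirq handler must finish quickly and must not sleep.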
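  • Not all softirqs run in softirq context; some softirq work may be handed to the ksoftirqd thread.
  • Whenever a thread is woken (IRQ_WAKE_THREAD, ksoftirqd, worker threads and so on), the work itself is not done in interrupt context. Interrupt context only performs the wakeup; when the thread actually runs is up to the scheduler.
  • To improve the real-time behaviour of Linux, two points matter: thread the top half, and hand all softirq work to the ksoftirqd thread.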
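
The "wake now, run later" behaviour of IRQ_WAKE_THREAD can be illustrated with a small user-space model. This is only a sketch under stated assumptions: model_irq, model_handle_irq(), model_schedule() and the demo_* functions are hypothetical names, and the real scheduler is reduced to a single function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; NOT the kernel's irqreturn_t or irq_desc. */
enum model_irqreturn { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED, MODEL_IRQ_WAKE_THREAD };

struct model_irq {
    enum model_irqreturn (*handler)(struct model_irq *); /* top half */
    void (*thread_fn)(struct model_irq *);               /* threaded part */
    bool thread_runnable;  /* set by the wakeup, cleared when the thread runs */
    int top_half_runs;
    int thread_runs;
};

/*
 * Hard-IRQ path: run the top half. If it asks for the thread, only mark the
 * thread runnable; the threaded handler itself is not called from here.
 */
void model_handle_irq(struct model_irq *irq)
{
    assert(irq != NULL && irq->handler != NULL);
    if (irq->handler(irq) == MODEL_IRQ_WAKE_THREAD && irq->thread_fn)
        irq->thread_runnable = true;
}

/* Stand-in for the scheduler picking the IRQ thread later, in process context. */
void model_schedule(struct model_irq *irq)
{
    assert(irq != NULL);
    if (irq->thread_runnable) {
        irq->thread_runnable = false;
        irq->thread_fn(irq);
    }
}

/* Example handlers used to exercise the model. */
enum model_irqreturn demo_top_half(struct model_irq *irq)
{
    irq->top_half_runs++;
    return MODEL_IRQ_WAKE_THREAD;
}

void demo_thread_fn(struct model_irq *irq)
{
    irq->thread_runs++;
}
```

After model_handle_irq() returns, the top half has run once and the thread is only marked runnable; the threaded handler runs when model_schedule() is called, which mirrors the point that interrupt context performs the wakeup and nothing more.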