How a CPU accesses peripherals



As far as I know there are two ways: memory-mapped I/O and port-mapped I/O. See the wiki [1].

But I also read another article [2]:

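Memory mapping: CPUs of some architectures (e.g. PowerPC, m68k) usually implement only one physical address space (RAM). In that case, the physical addresses of peripheral I/O ports are mapped into the CPU's single physical address space and become part of the memory space. The CPU can then access a peripheral I/O port just as it accesses a memory cell, without needing dedicated peripheral I/O instructions. This is the so-called "memory-mapped" approach. ARM CPUs all use this model.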

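To verify the article's claim that "the CPU can access a peripheral I/O port just as it accesses a memory cell, without needing dedicated peripheral I/O instructions", I looked at some Linux driver code. Drivers that use memory-mapped I/O access the device through interfaces such as readb/readw/readl (and likewise for writes). The code is defined in arch/arm/include/asm/io.h (not specific to any one ARM platform):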

/*
 *  Memory access primitives
 *  ------------------------
 *
 * These perform PCI memory accesses via an ioremap region.  They don't
 * take an address as such, but a cookie.
 *
 * Again, these are defined to perform little endian accesses.  See the
 * IO port primitives for more information.
 */
#ifndef readl
#define readb_relaxed(c) ({ u8  __r = __raw_readb(c); __r; })
#define readw_relaxed(c) ({ u16 __r = le16_to_cpu((__force __le16) \
					__raw_readw(c)); __r; })
#define readl_relaxed(c) ({ u32 __r = le32_to_cpu((__force __le32) \
					__raw_readl(c)); __r; })

#define writeb_relaxed(v,c)	__raw_writeb(v,c)
#define writew_relaxed(v,c)	__raw_writew((__force u16) cpu_to_le16(v),c)
#define writel_relaxed(v,c)	__raw_writel((__force u32) cpu_to_le32(v),c)

#define readb(c)		({ u8  __v = readb_relaxed(c); __iormb(); __v; })
#define readw(c)		({ u16 __v = readw_relaxed(c); __iormb(); __v; })
#define readl(c)		({ u32 __v = readl_relaxed(c); __iormb(); __v; })

Taking readb as an example, it ends up calling __raw_readb. __raw_readb appears to be architecture-specific, for example:

(1) arm

Defined in arch/arm/include/asm/io.h:

static inline u8 __raw_readb(const volatile void __iomem *addr)
{
	u8 val;
	asm volatile("ldrb %0, %1"
		     : "=r" (val)
		     : "Qo" (*(volatile u8 __force *)addr));
	return val;
}

This uses the assembly instruction ldrb (Load Register Byte (register)) to read the value. Does that mean MMIO access on ARM has its own dedicated instruction?

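(2) powerpc

Defined in arch/powerpc/include/asm/io.h. Whether indirect I/O address tokens are used or the passed-in address is accessed directly, there is no explicit assembly instruction as there is on ARM. PowerPC appears to match what the article says.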

/*
 * When CONFIG_PPC_INDIRECT_MMIO is set, the platform can provide hooks
 * on all MMIOs. (Note that this is all 64 bits only for now)
 *
 * To help platforms who may need to differentiate MMIO addresses in
 * their hooks, a bitfield is reserved for use by the platform near the
 * top of MMIO addresses (not PIO, those have to cope the hard way).
 *
 * The highest address in the kernel virtual space are:
 *
 *  d0003fffffffffff	# with Hash MMU
 *  c00fffffffffffff	# with Radix MMU
 *
 * The top 4 bits are reserved as the region ID on hash, leaving us 8 bits
 * that can be used for the field.
 *
 * The direct IO mapping operations will then mask off those bits
 * before doing the actual access, though that only happen when
 * CONFIG_PPC_INDIRECT_MMIO is set, thus be careful when you use that
 * mechanism
 *
 * For PIO, there is a separate CONFIG_PPC_INDIRECT_PIO which makes
 * all PIO functions call through a hook.
 */

#ifdef CONFIG_PPC_INDIRECT_MMIO
#define PCI_IO_IND_TOKEN_SHIFT	52
#define PCI_IO_IND_TOKEN_MASK	(0xfful << PCI_IO_IND_TOKEN_SHIFT)
#define PCI_FIX_ADDR(addr)						\
	((PCI_IO_ADDR)(((unsigned long)(addr)) & ~PCI_IO_IND_TOKEN_MASK))
#define PCI_GET_ADDR_TOKEN(addr)					\
	(((unsigned long)(addr) & PCI_IO_IND_TOKEN_MASK) >> 		\
		PCI_IO_IND_TOKEN_SHIFT)
#define PCI_SET_ADDR_TOKEN(addr, token) 				\
do {									\
	unsigned long __a = (unsigned long)(addr);			\
	__a &= ~PCI_IO_IND_TOKEN_MASK;					\
	__a |= ((unsigned long)(token)) << PCI_IO_IND_TOKEN_SHIFT;	\
	(addr) = (void __iomem *)__a;					\
} while(0)
#else
#define PCI_FIX_ADDR(addr) (addr)
#endif


/*
 * Non ordered and non-swapping "raw" accessors
 */

static inline unsigned char __raw_readb(const volatile void __iomem *addr)
{
	return *(volatile unsigned char __force *)PCI_FIX_ADDR(addr);
}
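For comparison with the ARM version, here is a minimal user-space sketch of the same volatile-dereference style. The names sketch_raw_readb, fake_status_reg and plain_ram_byte are made up for illustration and are not kernel APIs. A compiler typically turns this dereference into an ordinary byte load (on 32-bit ARM that is normally ldrb), so the C form and the inline-asm form end up as the same kind of instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: read one byte through a
 * volatile pointer, in the same style as the powerpc accessor. */
static inline uint8_t sketch_raw_readb(const volatile void *addr)
{
    assert(addr != NULL);
    return *(const volatile uint8_t *)addr;
}

/* Stand-in for a device register so the sketch runs in user space.
 * In a driver the address would come from ioremap() instead. */
static volatile uint8_t fake_status_reg = 0x5a;

/* Ordinary RAM, read through exactly the same helper. */
static uint8_t plain_ram_byte = 0x17;

uint8_t read_fake_status(void) { return sketch_raw_readb(&fake_status_reg); }
uint8_t read_plain_ram(void)   { return sketch_raw_readb(&plain_ram_byte); }
```

The helper cannot tell whether the address belongs to RAM or to a device; only the address value decides that.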

What do fellow V2EX users make of this? Did I misread or misunderstand some piece of the code?

[1]: https://en.wikipedia.org/wiki/Memory-mapped_I/O_and_port-mapped_I/O

[2]: http://blog.chinaunix.net/uid-30035173-id-4714589.html

Supplement 1 · Apr 24, 2023

@roycestevie6761
@leonshaw
@klwha
@rapiz

Hi All,

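I looked up the ARM assembly. ldrb can load a peripheral register's value into a CPU register, and it can equally load a value from DRAM (main memory) into a CPU register. So in essence the address space is still shared, which matches the wiki definition:
Memory-mapped I/O uses the same address space to address both main memory and I/O devices. The memory and registers of the I/O devices are mapped to (associated with) address values. So a memory address may refer to either a portion of physical RAM, or instead to memory and registers of the I/O device.

What I still don't understand is why ARM uses assembly to access peripheral registers while PowerPC appears to use plain C. Is it a performance consideration?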

12 replies • 2023-04-26 19:26:59 +08:00

roycestevie6761

Apr 24, 2023

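"The CPU can access a peripheral I/O port just as it accesses a memory cell, without needing dedicated peripheral I/O instructions"

If you have done embedded or microcontroller work, this is easy to understand:
https://github.com/search?q=GPIO_BASE&type=code
Every peripheral register is really just an address; which address it is can be found in the chip's manual.
Reading and writing those addresses controls the peripheral's behavior.
As for dedicated I/O instructions, the addresses that such I/O corresponds to are designed into the chip itself.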
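A minimal sketch of the "registers are just addresses" idea. The register layout and the names gpio_regs, fake_gpio and GPIO are hypothetical; a real layout and base address would come from the chip manual. To keep it runnable on a PC, the register block is an ordinary static struct instead of a fixed hardware address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
struct gpio_regs {
    volatile uint32_t dir;  /* bit n = 1: pin n is an output */
    volatile uint32_t out;  /* bit n: level driven on pin n */
};

/* On real hardware this would be something like
 *   #define GPIO ((struct gpio_regs *)GPIO_BASE)
 * with GPIO_BASE taken from the manual. Here a static struct
 * stands in for the device so the code runs in user space. */
static struct gpio_regs fake_gpio;
#define GPIO (&fake_gpio)

void gpio_set_output(unsigned pin)
{
    assert(pin < 32);
    GPIO->dir |= 1u << pin;       /* plain store to the "register" */
}

void gpio_write(unsigned pin, int level)
{
    assert(pin < 32);
    if (level)
        GPIO->out |= 1u << pin;
    else
        GPIO->out &= ~(1u << pin);
}

uint32_t gpio_dir_value(void) { return GPIO->dir; }
uint32_t gpio_out_value(void) { return GPIO->out; }
```

Every access here is an ordinary load or store; nothing in the code is specific to I/O except the address the pointer holds.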

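I haven't read your code or the links closely; this is just how I would explain it from embedded development experience. If I'm wrong, just point it out.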

roycestevie6761

Apr 24, 2023

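The last link actually explains it reasonably well, but it analyses things from a microcontroller perspective, while you are looking at it from a modern CPU perspective. MMIO just means that part of the memory address range is used for device registers, and that is the same on every architecture.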

leonshaw

Apr 24, 2023 via Android

Doesn't ldrb just read one byte of memory?

huangya

Apr 24, 2023

@roycestevie6761 I'm still a bit confused and would like to analyse this from actual driver code. This link http://www.embeddedlinux.org.cn/emb-linux/system-development/201710/13-7532.html says that access can be done in different ways.

/*
 * readX/writeX() are used to access memory mapped devices. On some
 * architectures the memory mapped IO stuff needs to be accessed
 * differently. On the simple architectures, we just read/write the
 * memory location directly.
 */

writel() writes data to the memory-mapped I/O space; specifically, writel() writes 32 bits (4 bytes) of data to the I/O location.

huangya

Apr 24, 2023

@leonshaw Huh? My understanding is that it reads a value from the peripheral's register into a CPU register.
Reference: https://developer.arm.com/documentation/ddi0406/cb/Application-Level-Architecture/Instruction-Details/Alphabetical-list-of-instructions/LDRB--register-

<Rt>

The destination register.
<Rn>

The base register. The SP can be used. In the ARM instruction set the PC can be used, for the offset addressing form of the instruction only. In the Thumb instruction set, the PC cannot be used with any of these forms of the LDRB instruction.
+/-

Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value (add == TRUE, encoded as U == 1 in encoding A1), or - if it is to be subtracted (permitted in ARM instructions only, add == FALSE, encoded as U == 0).
<Rm>

Contains the offset that is optionally shifted and applied to the value of <Rn> to form the address.

klwha

Apr 24, 2023 via Android

@huangya That's roughly it. A peripheral register can be mapped to some memory address, and the actual mapping differs from board to board.

roycestevie6761

Apr 24, 2023

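@huangya The code you posted isn't much use; it is only a fragment, and it doesn't explain what memory mapping actually is.
I suggest reading the Intel or AMD manuals directly. For example, x86 CPUs have registers that belong to the APIC:

https://stackoverflow.com/questions/51966947/can-different-cpus-on-an-x86-machine-can-have-different-local-apic-register-mmio

These registers are mapped to a kernel address. Because user mode has no read/write permission on high kernel addresses, a driver has to do the reads and writes.
Let's see how an operating system handles it.

The operation that writes an APIC register:
https://github.com/tongzx/nt5src/blob/daad8a087a4e75422ec96b7911f1df4669989611/Source/XPSP1/NT/drivers/wdm/rt/exec/apic.c#L393

The link below shows directly that writing a register is just writing a value to a specific address, which is the essence of MMIO: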
https://github.com/tongzx/nt5src/blob/daad8a087a4e75422ec96b7911f1df4669989611/Source/XPSP1/NT/drivers/wdm/rt/exec/apic.h#L232

roycestevie6761

Apr 24, 2023

The two links I posted are Windows XP source code; the Linux code you posted is too obscure.

rapiz

Apr 24, 2023

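Take RISC-V (which uses MMIO) as an example: the instruction that accesses a peripheral is exactly the same as the one that reads a memory address.

For example, to read one byte of memory at 0x8000000, the instruction is lb (load byte) 0x8000000.
If you read a timer whose one-byte register is mapped at 0x10000 in the memory space, the instruction is lb 0x10000. There is no difference at all.

The hardware implements this in address decoding: when the CPU performs a memory access, the request is routed to a different hardware component depending on the range the address falls in, and the result is returned to the CPU.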
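The address decoding described here can be sketched as a toy software model. This only simulates the idea; the base addresses are the ones from the example above, while the sizes, the function names and the timer behaviour (the counter advances on every read) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_BASE   0x8000000u
#define RAM_SIZE   0x1000u     /* assumed size of the toy RAM */
#define TIMER_BASE 0x10000u
#define TIMER_SIZE 0x1u        /* a single one-byte register */

static uint8_t ram[RAM_SIZE];
static uint8_t timer_ticks;    /* state inside the toy timer device */

/* The toy timer advances on every read, which plain RAM never does. */
static uint8_t timer_read(void) { return timer_ticks++; }

/* Toy decoder: one "load byte" operation, routed by address range. */
uint8_t load_byte(uint32_t addr)
{
    if (addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE)
        return ram[addr - RAM_BASE];
    if (addr >= TIMER_BASE && addr - TIMER_BASE < TIMER_SIZE)
        return timer_read();
    assert(0 && "bus error: unmapped address");
    return 0;
}

/* Matching "store byte": a store to the timer reloads its counter. */
void store_byte(uint32_t addr, uint8_t value)
{
    if (addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE)
        ram[addr - RAM_BASE] = value;
    else if (addr >= TIMER_BASE && addr - TIMER_BASE < TIMER_SIZE)
        timer_ticks = value;
    else
        assert(0 && "bus error: unmapped address");
}
```

The caller issues the same load_byte() for both targets. Reading RAM twice returns the same value, while reading the timer twice does not, which is one reason device registers need more careful handling than memory.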

artnowben

Apr 24, 2023

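Take a NIC as an example. DPDK can drive the NIC from user space because the NIC's registers can be mapped to memory addresses; reading and writing those addresses with ordinary load/store instructions is the same as operating on the registers. This is convenient for software developers.
https://github.com/baidu/dperf is a project in the DPDK ecosystem that you can use for debugging and experimenting.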
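On Linux, the usual user-space mechanism for this kind of mapping is mmap(). The sketch below is not DPDK code: it maps a temporary file that merely stands in for a device register region, and the names map_registers and fake_bar_roundtrip are made up. A real driver would map a device file instead, which needs the right hardware and privileges and is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096u

/* Map len bytes of fd and treat them as an array of 32-bit registers. */
volatile uint32_t *map_registers(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Write one "register" through the mapping and read it back.
 * Returns 0 on success, -1 on any failure. */
int fake_bar_roundtrip(void)
{
    FILE *f = tmpfile();               /* stand-in for a device file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, REGION_SIZE) != 0) {
        fclose(f);
        return -1;
    }
    volatile uint32_t *regs = map_registers(fd, REGION_SIZE);
    if (regs == NULL) {
        fclose(f);
        return -1;
    }
    regs[1] = 0xdeadbeefu;             /* ordinary store = register write */
    uint32_t readback = regs[1];       /* ordinary load = register read */
    munmap((void *)regs, REGION_SIZE);
    fclose(f);
    return readback == 0xdeadbeefu ? 0 : -1;
}
```

Once the mapping exists, the program touches the region with plain loads and stores and makes no system call per access.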

MstMoonshine

Apr 26, 2023 via iPad

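"Operating on an I/O device as if it were memory" means that device registers can be accessed through memory addresses; DRAM and MMIO devices sit on the same memory bus.

That does not mean the two should be treated as identical in a real project, because the two kinds of memory behave very differently. For example, a device register may be modified frequently by hardware, so each access has to invalidate the cache and read the value again. Also, for a peripheral, two seemingly unrelated memory operations may actually depend on each other, so a fence instruction is needed to make sure they are not executed out of order. That is why interfaces such as readb() are still needed to handle MMIO specially.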
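A rough user-space sketch of what such a wrapper adds on top of the bare load. The names sketch_readb, sketch_writeb, wait_ready and fake_status are hypothetical. The C11 fence is only a stand-in for an architecture barrier such as the __iormb() in the ARM code quoted earlier; real device access needs the kernel's own barriers and a suitably mapped (typically uncached) region.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_READY 0x01u

/* Stand-in for a device status register. */
static volatile uint8_t fake_status;

/* volatile forces a real load on every call; the fence keeps later
 * memory accesses from being moved before it. */
static inline uint8_t sketch_readb(const volatile uint8_t *reg)
{
    assert(reg != NULL);
    uint8_t v = *reg;
    atomic_thread_fence(memory_order_acquire);  /* stand-in barrier */
    return v;
}

/* The fence keeps earlier memory accesses from being moved after
 * the register write. */
static inline void sketch_writeb(uint8_t v, volatile uint8_t *reg)
{
    assert(reg != NULL);
    atomic_thread_fence(memory_order_release);  /* stand-in barrier */
    *reg = v;
}

/* Poll until the READY bit is set or max_polls reads have been made.
 * Returns 1 if the device became ready, 0 otherwise. */
int wait_ready(const volatile uint8_t *status, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (sketch_readb(status) & STATUS_READY)
            return 1;
    }
    return 0;
}
```

Without volatile, a compiler would be free to read the status once and reuse the value on every loop iteration, so the loop would never see the hardware change it.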

huangya

Apr 26, 2023

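@MstMoonshine Thanks for the reply. I think the "fence instructions" you mention are extra instructions issued inside readb. But the core read instruction (call it r(x), where x is the address) is the same whether it reads main memory or a device register, because both are in the same address space (from the hardware's point of view, I assume that is the same memory bus you mentioned). I have worked with port-mapped I/O devices; because they are not in the same address space, extra instructions are needed to operate the peripheral. On x86 this is done with the separate in(y) and out(y) instructions.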
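The difference between the two schemes can be sketched with a toy model of a machine that has two separate address spaces. This is a simulation only, with made-up names (toy_load, toy_store, toy_in, toy_out); real x86 in/out instructions need I/O privileges and cannot be run from an ordinary process.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MEM_SIZE 4096u

/* Space reached by ordinary loads and stores (RAM plus any MMIO). */
static uint8_t mem_space[TOY_MEM_SIZE];

/* Separate port space, reached only by the IN/OUT-style operations.
 * x86 port numbers are 16 bits wide, hence 65536 entries. */
static uint8_t io_space[65536];

uint8_t toy_load(uint32_t addr)
{
    assert(addr < TOY_MEM_SIZE);
    return mem_space[addr];
}

void toy_store(uint32_t addr, uint8_t value)
{
    assert(addr < TOY_MEM_SIZE);
    mem_space[addr] = value;
}

/* Port-mapped access: different operations, different space. */
uint8_t toy_in(uint16_t port)               { return io_space[port]; }
void    toy_out(uint16_t port, uint8_t val) { io_space[port] = val; }
```

In this model the number 0x60 names two unrelated locations, one per space, which is why a port-mapped design needs its own instructions while a memory-mapped design can reuse the normal load and store.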