R-Car/Merging-MMC-block-requests
Overview
Linux kernel v5.4 and later can merge MMC block requests by using the IOMMU. Enabling this feature on R-Car Gen3 can improve MMC read/write performance.
Without this feature
- The Linux block layer cannot merge any bios.
- Because max_segs of the R-Car Gen3 SDHI is 1, the Linux block layer prepares only one sg (scatter-gather) entry.
With this feature
- The Linux block layer can merge bios if the filesystem data is aligned to the IOMMU page size (e.g. 4 KB pages).
- Even though max_segs of the R-Car Gen3 SDHI is 1, this feature can map the pages contiguously if the data is aligned.
- As a result, fewer read/write commands need to be issued than without this feature, which reduces overhead and can improve MMC read/write performance.
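One way to observe the effect described above is to read the segment limits that the block layer exports through sysfs. The following is a minimal sketch, assuming the SDHI-backed device appears as mmcblk0; the function name `show_queue_limits` and the default path are assumptions for illustration. Without the feature, max_segments is expected to match the SDHI's max_segs of 1; with the feature, it should report a larger value.

```sh
#!/bin/sh
# Print the segment limits the block layer uses for one block device.
# Argument: the device's queue directory in sysfs.
# The default below assumes the device is mmcblk0 (hypothetical).
show_queue_limits() {
    queue_dir="${1:-/sys/block/mmcblk0/queue}"
    for attr in max_segments max_segment_size; do
        if [ -r "$queue_dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$queue_dir/$attr")"
        else
            printf '%s=unavailable\n' "$attr"
        fi
    done
}
```

For example, `show_queue_limits /sys/block/mmcblk1/queue` prints the limits for a second MMC device, if one exists on the board.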
Remarks
- Even with this feature enabled, the Linux block layer cannot merge bios if the filesystem data is not aligned (e.g. ext4 on a small partition with -J size=1).
Related commits
- 158a6d3 iommu/dma: add a new dma_map_ops of get_merge_boundary()
- 6ba9941 dma-mapping: introduce dma_get_merge_boundary()
- 38c38cb mmc: queue: use bigger segments if DMA MAP layer can merge the segments
- 45147fb block: add a helper function to merge the segments
How to enable the feature on R-Car Gen3?
Kernel configuration
We need to enable the following kernel configuration options.
    CONFIG_MMC=y
    CONFIG_MMC_BLOCK=y
    CONFIG_MMC_SDHI=y
    CONFIG_MMC_SDHI_INTERNAL_DMAC=y
    CONFIG_IOMMU_SUPPORT=y
    CONFIG_IPMMU_VMSA=y
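The options listed above can be checked against a kernel .config file with a small script. This is a sketch only; the function name `check_mmc_merge_config` is an assumption for illustration, and it treats an option as enabled only when it is set to `y`, as in the list above.

```sh
#!/bin/sh
# Report which of the required options are not set to "y" in a kernel
# .config file. Returns 0 only when all of them are enabled.
check_mmc_merge_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_MMC CONFIG_MMC_BLOCK CONFIG_MMC_SDHI \
               CONFIG_MMC_SDHI_INTERNAL_DMAC CONFIG_IOMMU_SUPPORT \
               CONFIG_IPMMU_VMSA; do
        # Anchor the match so CONFIG_MMC does not match CONFIG_MMC_BLOCK.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as `check_mmc_merge_config .config` in the kernel build directory. On a running target built with CONFIG_IKCONFIG_PROC, the configuration can first be extracted with `zcat /proc/config.gz > /tmp/config`.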
Modify a driver
We need to add the SDHI device names to rcar_gen3_slave_whitelist in drivers/iommu/ipmmu-vmsa.c as shown below.
v5.8 or earlier
    static const char * const rcar_gen3_slave_whitelist[] = {
        "ee100000.sd",
        "ee120000.sd",
        "ee140000.sd",
        "ee160000.sd",
    };
v5.9 or later
    static const char * const rcar_gen3_slave_whitelist[] = {
        "ee100000.mmc",
        "ee120000.mmc",
        "ee140000.mmc",
        "ee160000.mmc",
    };
How to confirm?
After the kernel has booted, we can confirm via sysfs whether the feature is enabled.
    # ls /sys/kernel/iommu_groups/0/devices/
    ee100000.mmc ee140000.mmc ee160000.mmc
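The group number is not necessarily 0 on every board or kernel, so it may be more convenient to scan all IOMMU groups. The sketch below assumes the SDHI devices are named `*.mmc` (v5.9 or later) or `*.sd` (v5.8 or earlier), as in the whitelist entries; the function name `list_iommu_sdhi_devices` is hypothetical.

```sh
#!/bin/sh
# List SDHI devices (names ending in .mmc or .sd) found in any IOMMU group.
# Argument: the iommu_groups directory (defaults to the sysfs location).
# Returns 0 if at least one such device was found, 1 otherwise.
list_iommu_sdhi_devices() {
    groups_dir="${1:-/sys/kernel/iommu_groups}"
    found=1
    for dev in "$groups_dir"/*/devices/*; do
        [ -e "$dev" ] || continue
        case "$(basename "$dev")" in
            *.mmc|*.sd)
                # Strip the leading directory to recover the group number.
                group="${dev#"$groups_dir"/}"
                echo "group ${group%%/*}: $(basename "$dev")"
                found=0
                ;;
        esac
    done
    return "$found"
}
```

Calling `list_iommu_sdhi_devices` with no argument scans /sys/kernel/iommu_groups. If it prints nothing and returns non-zero, the SDHI devices are probably not attached to the IOMMU.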
Performance measurement
The following table shows performance measured on v5.1-rc6[1].
environment | Sequential Output (KB/sec) | Sequential Input (KB/sec) |
---|---|---|
H3 without this feature | 117,133 | 118,682 |
H3 with this feature | 130,482 | 195,727 |
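To put the table in relative terms, the gain can be computed from the two rows. This is a small sketch using awk; the helper name `percent_gain` is an assumption for illustration.

```sh
#!/bin/sh
# Percentage change from a baseline throughput to a new throughput.
percent_gain() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) * 100 / base }'
}

percent_gain 117133 130482   # sequential output, KB/sec
percent_gain 118682 195727   # sequential input, KB/sec
```

With the values above, this gives a gain of about 11.4% for sequential output and about 64.9% for sequential input.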