一次向linux開源社區提交補丁的經歷

背景

在開發過程中,偶然發現了spinand驅動的一個bug,滿懷欣喜地往社區提補丁。這是怎麼樣的一個bug呢?

static int spinand_mtd_read(struct mtd_info *mtd, loff_t from,
                            struct mtd_oob_ops *ops)
{
        ......
        nanddev_io_for_each_page(nand, from, ops, &iter) {
                ......
                ret = spinand_read_page(spinand, &iter.req, enable_ecc);
                if (ret < 0 && ret != -EBADMSG)     /* 讀取數據出錯 */
                        break;

                if (ret == -EBADMSG) {
                        /* -EBADMSG 返回表示壞塊 */
                        ecc_failed = true;
                        mtd->ecc_stats.failed++;
                        ret = 0;
                } else {
                        /* 出現位翻轉或者讀取正常,則記錄歷史位翻轉最大值 */
                        mtd->ecc_stats.corrected += ret;
                        max_bitflips = max_t(unsigned int, max_bitflips, ret);
                }

                ops->retlen += iter.req.datalen; 
                ops->oobretlen += iter.req.ooblen;
        }

        if (ecc_failed && !ret)
                ret = -EBADMSG;

        return ret ? ret : max_bitflips;
}

代碼邏輯如下:

  1. 遍歷讀取每一個page
  2. 如果讀出錯則直接返回
  3. 如果出現壞塊,則置位ecc_failed,在函數最後會檢查此標誌
  4. 如果出現位翻轉,則暫存最大位翻轉的bit位數量
  5. 全部讀取完后,如果有置位ecc_failed,則返回壞塊錯誤碼;如果出現位翻轉,則返回最大位翻轉;否則返回0,表示正常

問題出在於,如果剛好最後一次讀取出現位翻轉,此時ret != 0就直接退出循環,此時會導致壞塊標識無效,且返回最後的位翻轉量而非歷史位翻轉最大值。這是代碼不嚴謹的地方。

第一次提交

修改補丁如下,補丁邏輯不再解釋。

In function spinand_mtd_read, if the last page to read occurs bitflip,
this function will return error value because veriable ret not equal to 0.

Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
---
 drivers/mtd/nand/spi/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index 556bfdb..6b9388d 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -511,12 +511,12 @@ static int spinand_mtd_read(struct mtd_info *mtd, loff_t from,
        if (ret == -EBADMSG) {
            ecc_failed = true;
            mtd->ecc_stats.failed++;
-           ret = 0;
        } else {
            mtd->ecc_stats.corrected += ret;
            max_bitflips = max_t(unsigned int, max_bitflips, ret);
        }
 
+       ret = 0;
        ops->retlen += iter.req.datalen;
        ops->oobretlen += iter.req.ooblen;
    }

21:13分發出的郵件,21:45分陸續收到兩個回復:

<maintainer A>:

Actually, that's exactly what the MTD core expects (see [1]), so you're
the one introducing a regression here.
<maintainer B>:

To me it looks like the patch description is somewhat incorrect, but the 
fix itself looks okay, unless I'm getting it wrong.

In case of the last page containing bitflips (ret > 0), 
spinand_mtd_read() will return that number of bitflips for the last 
page. But to me it looks like it should instead return max_bitflips like 
it does when the last page read returns with 0.

以及隔天回復

<maintainer A>:

Oh, you're right. liaoweixiong, can you adjust the commit message
accordingly?

好吧,問題出在與我沒把問題描述清楚,改改再提交

第二次提交

只改了comment和補丁標題:

Subject: [PATCH v2] mtd: spinand: read return badly if the last page has bitflips

In case of the last page containing bitflips (ret > 0), 
spinand_mtd_read() will return that number of bitflips for the last 
page. But to me it looks like it should instead return max_bitflips like 
it does when the last page read returns with 0.

然後嘩啦啦收到兩個Reviewed-by,附帶一個建議:

Reviewed-by: <maintainer B>

This should probably be resent with the following tags:

Cc: stable@vger.kernel.org
Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI 
NANDs")

得,再提交一次吧

第三次提交

此時的我提交補丁到社區經驗並不多,Maintainer讓我resend,我就忐忑開始胡思亂想了:

版本號需要累加么?該怎麼標記是重新發送?有兩個maintainer已經”認可”了我的補丁(reviewed-by),我改怎麼體現到新的郵件中?

仔細想想內容並沒改,因此不需要累加版本號;查詢前人提交,在郵件標題可以加上RESEND字樣;搜索含RESEND字樣的前人郵件,剛好找到一個在maintainer reviewed后resend為acked,寫在signed-off-by區。

OK,確定下來就重新發吧

Subject: [RESEND PATCH v2] mtd: spinand: read return badly if the last page has bitflips

......
Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
Acked-by: <maintainer A>
Acked-by: <maintainer B>
Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs")

很快,就挨批了…

第四次提交

晚上10點多,收到回復:

<maintainer B>

Why did you change our Reviewed-by tags to Acked-by tags?

額…我也是看別人這麼做我才這麼做的,大佬生氣了!趕緊補救

......
Reviewed-by: <maintainer A>
Reviewed-by: <maintainer B>
Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs")

第五次提交

埋下的坑終究是要踩的,很快,再次挨批了

<maintainer C>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
<maintainer A>

FYI, you should not send the patch to stable@vger.kernel.org, but 
instead, as I said in my other reply, add the tag "Cc: 
stable@vger.kernel.org". See "Option 1" in the document Greg referred to.

小白趕緊狠補基礎操作規範…

第六次提交

......
Reviewed-by: <maintainer A>
Reviewed-by: <maintainer B>
Cc: stable@vger.kernel.org
Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs")

總結

哎,我只是挪了一行代碼的位置而已啊,Maintainer嚴審下,我竟然提交了6次!6次!突然感覺心好累。

累歸累,問題總結還是需要的

  1. 新手不具備提交代碼的一些常識,包括 a) 提交中各個tag的含義,在什麼時候加這些tag,例如Reviewed-by和Acked-by的差別 b) 提交補丁到stable的注意事項
  2. 對補丁的問題描述不夠仔細清楚,導致 無法理解,幸好 幫我澄清了

解決方法:

  1. linux提交有規範文檔的,抽時間擼一遍,並翻譯發博客
  2. 在發補丁之前,讓身邊的人幫忙看一下補丁說明是否足夠清晰

希望我的經歷能幫助到正在或者準備向Linux內核開源社區的小夥伴

【精選推薦文章】

如何讓商品強力曝光呢? 網頁設計公司幫您建置最吸引人的網站,提高曝光率!!

想要讓你的商品在網路上成為最夯、最多人討論的話題?

網頁設計公司推薦更多不同的設計風格,搶佔消費者視覺第一線

不管是台北網頁設計公司台中網頁設計公司,全省皆有專員為您服務

想知道最厲害的台北網頁設計公司推薦台中網頁設計公司推薦專業設計師"嚨底家"!!

您可能也會喜歡…