以太坊C++源码解析（四）RLPStream类

RLP是一种特殊的二进制编码解码方式，以太坊里数据包都是采用这种方式编码的，和传统的结构相比，RLP编码更节省空间，提高网络传输效率，缺点就是不太直观，这种编码解码原理介绍在下面这边文章里讲得很好，还附有python的实现代码：RLP编码和解码。但是在本文里我们来看看在C++里的实现。

RLPStream类和RLP类在libdevcore\RLP.h和RLP.cpp文件中，前者负责编码，后者负责解码。

RLPStream类

我们先来看看编码方式吧，RLP里面可以放两种数据，一种是字符串(Data)，另一种是列表(List)，字符串又分为单个字符和多个字符，我们一个一个来。

单个字符
RLP类提供了这些方法来编码单个字符：

RLPStream& append(unsigned _s) { return append(bigint(_s)); }
RLPStream& append(u160 _s) { return append(bigint(_s)); }
RLPStream& append(u256 _s) { return append(bigint(_s)); }
RLPStream& append(bigint _s);

可见单个字符先都会被转换为bigint，再进行转换。

RLPStream& RLPStream::append(bigint _i)
{
    if (!_i)
        m_out.push_back(c_rlpDataImmLenStart);
    else if (_i < c_rlpDataImmLenStart)
        m_out.push_back((byte)_i);
    else
    {
        unsigned br = bytesRequired(_i);
        if (br < c_rlpDataImmLenCount)
            m_out.push_back((byte)(br + c_rlpDataImmLenStart));
        else
        {
            auto brbr = bytesRequired(br);
            if (c_rlpDataIndLenZero + brbr > 0xff)
                BOOST_THROW_EXCEPTION(RLPException() << errinfo_comment("Number too large for RLP"));
            m_out.push_back((byte)(c_rlpDataIndLenZero + brbr));
            pushInt(br, brbr);
        }
        pushInt(_i, br);
    }
    noteAppended();
    return *this;
}

如果_i是0，则直接存入c_rlpDataImmLenStart，也就是0x80
如果_i小于0x80，则直接存入_i
否则计算存入_i所需要的字节数br，如果br小于c_rlpDataImmLenCount也就是8，则先存入br + 0x80，再调用pushInt(_i, br);存入_i
计算存入br所需的长度brbr，也就是_i长度的长度，将c_rlpDataIndLenZero + brbr存进去，即0xb7 + brbr，再调用pushInt(_i, br);存入_i
注意到后面还有个函数noteAppended();这个是用来处理列表长度的，后面再谈。

多个字符
保存多个字符有下列方法：

RLPStream& append(bytesConstRef _s, bool _compact = false);
RLPStream& append(bytes const& _s) { return append(bytesConstRef(&_s)); }
RLPStream& append(std::string const& _s) { return append(bytesConstRef(_s)); }
RLPStream& append(char const* _s) { return append(std::string(_s)); }
template <unsigned N> RLPStream& append(FixedHash<N> _s, bool _compact = false, bool _allOrNothing = false) { return _allOrNothing && !_s ? append(bytesConstRef()) : append(_s.ref(), _compact); }

方法都是先转换为字节数组bytesConstRef，然后再编码。

RLPStream& RLPStream::append(bytesConstRef _s, bool _compact)
{
    size_t s = _s.size();
    byte const* d = _s.data();
    if (_compact)
        for (size_t i = 0; i < _s.size() && !*d; ++i, --s, ++d) {}

    if (s == 1 && *d < c_rlpDataImmLenStart)
        m_out.push_back(*d);
    else
    {
        if (s < c_rlpDataImmLenCount)
            m_out.push_back((byte)(s + c_rlpDataImmLenStart));
        else
            pushCount(s, c_rlpDataIndLenZero);
        appendRaw(bytesConstRef(d, s), 0);
    }
    noteAppended();
    return *this;
}

如果是压缩模式，则忽略数组_s开头的0
如果数组只有单个字符，并且该字符小于0x80，则直接存入该字符
否则如果数组长度小于0x38，也就是56，则先存入s + 0x80，再调用appendRaw(bytesConstRef(d, s), 0);存入数组本身
如果数组长度大于等于56，则先存入数组长度s的长度+0xb7，再存入数组长度s，再调用appendRaw(bytesConstRef(d, s), 0);存入数组本身
也调用了noteAppended();

列表
存入列表有这些方法：

template <class _T> RLPStream& append(std::vector<_T> const& _s) { return appendVector(_s); }
template <class _T> RLPStream& appendVector(std::vector<_T> const& _s) { appendList(_s.size()); for (auto const& i: _s) append(i); return *this; }
template <class _T, size_t S> RLPStream& append(std::array<_T, S> const& _s) { appendList(_s.size()); for (auto const& i: _s) append(i); return *this; }
template <class _T> RLPStream& append(std::set<_T> const& _s) { appendList(_s.size()); for (auto const& i: _s) append(i); return *this; }
template <class _T> RLPStream& append(std::unordered_set<_T> const& _s) { appendList(_s.size()); for (auto const& i: _s) append(i); return *this; }
template <class T, class U> RLPStream& append(std::pair<T, U> const& _s) { appendList(2); append(_s.first); append(_s.second); return *this; }

我们选这个来看：

1	template <class _T> RLPStream& appendVector(std::vector<_T> const& _s) { appendList(_s.size()); for (auto const& i: _s) append(i); return *this; }

可见先是调用appendList(_s.size())存入列表大小，然后依次存入列表元素。
但是当我们深入appendList()函数里时，发现不是这么回事：

RLPStream& RLPStream::appendList(size_t _items)
{
    if (_items)
        m_listStack.push_back(std::make_pair(_items, m_out.size()));
    else
        appendList(bytes());
    return *this;
}

这里只是将列表元素个数和当前输出缓冲区大小放入了一个列表栈m_listStack里，并没有存入任何东西。那列表长度是什么时候放进m_out的呢？答案就是前面提到的noteAppended()函数。
noteAppended()函数代码为：

void RLPStream::noteAppended(size_t _itemCount)
{
    if (!_itemCount)
        return;

    while (m_listStack.size())
    {
        /// ...
        m_listStack.back().first -= _itemCount;
        if (m_listStack.back().first)
            break;
        else
        {
            auto p = m_listStack.back().second;
            m_listStack.pop_back();
            size_t s = m_out.size() - p;		// list size
            auto brs = bytesRequired(s);
            unsigned encodeSize = s < c_rlpListImmLenCount ? 1 : (1 + brs);
            auto os = m_out.size();
            m_out.resize(os + encodeSize);
            memmove(m_out.data() + p + encodeSize, m_out.data() + p, os - p);
            if (s < c_rlpListImmLenCount)
                m_out[p] = (byte)(c_rlpListStart + s);
            else if (c_rlpListIndLenZero + brs <= 0xff)
            {
                m_out[p] = (byte)(c_rlpListIndLenZero + brs);
                byte* b = &(m_out[p + brs]);
                for (; s; s >>= 8)
                    *(b--) = (byte)s;
            }
        }
        _itemCount = 1;	// for all following iterations, we've effectively appended a single item only since we completed a list.
    }
}

m_out中每存入一个字符或者一个字符串都要调用一次这个函数，这个函数是从后向前遍历m_listStack。

判断当前列表是否全部存完，如果没有存完则退出：

1 2	if (m_listStack.back().first) break;

否则先计算列表成员在m_out中实际占用空间的大小：

1	size_t s = m_out.size() - p;

再计算出要存入这个s所需空间的大小encodeSize。
将数组整体向后移动encodeSize个空间

1	memmove(m_out.data() + p + encodeSize, m_out.data() + p, os - p);

将’s’编码并填进空位中。

RLPStream类将数据编码进m_out以后，就可以通过swapOut()函数导出为字节数组了。

# 以太坊 C++ 源码

以太坊C++源码解析（四）RLPStream类

RLPStream类

喜欢这篇文章？打赏一下作者吧

分类

最新文章

标签

Your browser is out-of-date!