The Secrets of DNS Queries

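DNS started out as a fairly simple protocol: it translates a domain name into an IP address, and in everyday use it runs over UDP. That is why Stevens's "TCP/IP Illustrated, Volume 1" covers it in little more than ten pages.

However, if we look up the DNS-related RFCs at https://rfc-annotations.research.icann.org/ we find that basic queries alone are now covered by 74 RFCs. What do all these RFCs define?

Let's start from practice and study one part of EDNS: how the client's IP can be used to get the resolution that actually fits the client. Along the way I also had a few questions.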

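EDNS itself is defined in RFC 6891; the Client Subnet option discussed here is defined in RFC 7871: https://datatracker.ietf.org/doc/html/rfc7871

If you want to know more about the DNS wire format, see https://root00r.github.io/protocol/ where the author has collected the packet layouts of many common protocols.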

DNS query

As most readers know, an ordinary DNS lookup is a typical recursive query. For details, see Ruan Yifeng's blog post: https://www.ruanyifeng.com/blog/2022/08/dns-query.html

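In practice, especially with CDNs, we often need nearest-location resolution, and this is where recursive queries get in the way: the upstream DNS server sees only the recursive DNS server's IP and resolves based on that, not on the end user's IP.

For example, I am clearly a China Mobile user in Beijing, yet when I query the IP address of www.baidu.com, I get back a China Telecom IP in Nanjing.

A third-party probing service shows that a Beijing Mobile user should normally get two Beijing Mobile IPs: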

https://www.boce.com/dns/14db064306b7f51019b51917f617248d.html

So the question is: why does this happen?
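One way to picture the cause is a toy model of how an authoritative server might pick an answer. This is only a sketch under stated assumptions: the `GEO_TABLE` mapping, the point-of-presence names, and the documentation-range prefixes are all hypothetical and do not come from any real deployment.

```python
import ipaddress

# Hypothetical geo table: prefix -> point of presence (illustrative only).
GEO_TABLE = {
    ipaddress.ip_network("198.51.100.0/24"): "pop-near-resolver",
    ipaddress.ip_network("203.0.113.0/24"): "pop-near-user",
}


def choose_pop(resolver_ip, ecs_subnet=None):
    """Pick a point of presence from the only address information available.

    Without a client subnet, the server can only look at the resolver's IP.
    With one, it can look at the client's network instead.
    """
    if ecs_subnet is not None:
        key = ipaddress.ip_network(ecs_subnet, strict=False).network_address
    else:
        key = ipaddress.ip_address(resolver_ip)
    for network, pop in GEO_TABLE.items():
        if key in network:
            return pop
    return "default-pop"


# The user sits in 203.0.113.0/24 but uses a resolver in 198.51.100.0/24.
print(choose_pop("198.51.100.53"))                    # decided by the resolver
print(choose_pop("198.51.100.53", "203.0.113.0/24"))  # decided by the client subnet
```

In this model the first call lands on the resolver's location and the second on the user's, which is the mismatch described above.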

EDNS Client Subnet (ECS)

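The ECS RFC was published in 2016, with most of its authors from Google. Part of its purpose is to fix exactly this kind of source mismatch, caused by end users pointing their machines at the wrong DNS server.

Section 6 of the RFC defines the ECS format. It is fairly simple: the query carries one extra option that submits the client's subnet.

Concretely, OPTION-CODE is set to 8 to mark the option as ECS, and FAMILY distinguishes IPv4 from IPv6. Note, however, that the ADDRESS field is long enough to carry the end user's complete IP. Is that a security problem, since it would expose the user's privacy?

We also know that DNS answers are cached. Is the cache keyed by domain name, or by this subnet?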

This protocol uses an EDNS0 [RFC6891] option to include client
address information in DNS messages. The option is structured as
follows:

                +0 (MSB)                            +1 (LSB)
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   0: |                          OPTION-CODE                          |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   2: |                         OPTION-LENGTH                         |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   4: |                            FAMILY                             |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   6: |     SOURCE PREFIX-LENGTH      |      SCOPE PREFIX-LENGTH      |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   8: |                          ADDRESS...                           /
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

o (Defined in [RFC6891]) OPTION-CODE, 2 octets, for ECS is 8 (0x00
0x08).

o (Defined in [RFC6891]) OPTION-LENGTH, 2 octets, contains the
length of the payload (everything after OPTION-LENGTH) in octets.

o FAMILY, 2 octets, indicates the family of the address contained in
the option, using address family codes as assigned by IANA in
Address Family Numbers [Address_Family_Numbers].

The format of the address part depends on the value of FAMILY. This
document only defines the format for FAMILY 1 (IPv4) and FAMILY 2
(IPv6), which are as follows:

o SOURCE PREFIX-LENGTH, an unsigned octet representing the leftmost
number of significant bits of ADDRESS to be used for the lookup.
In responses, it mirrors the same value as in the queries.


o SCOPE PREFIX-LENGTH, an unsigned octet representing the leftmost
number of significant bits of ADDRESS that the response covers.
In queries, it MUST be set to 0.

o ADDRESS, variable number of octets, contains either an IPv4 or
IPv6 address, depending on FAMILY, which MUST be truncated to the
number of bits indicated by the SOURCE PREFIX-LENGTH field,
padding with 0 bits to pad to the end of the last octet needed.

o A server receiving an ECS option that uses either too few or too
many ADDRESS octets, or that has non-zero ADDRESS bits set beyond
SOURCE PREFIX-LENGTH, SHOULD return FORMERR to reject the packet,
as a signal to the software developer making the request to fix
their implementation.
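The option layout quoted above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names `encode_ecs_option` and `decode_ecs_option` are my own, and the decoder raises `ValueError` where a real server would answer FORMERR.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # OPTION-CODE for ECS
FULL_LENGTH = {1: 4, 2: 16}  # FAMILY -> full address length in octets


def encode_ecs_option(address, source_prefix_len):
    """Build the ECS option bytes for a query (SCOPE PREFIX-LENGTH is 0)."""
    net = ipaddress.ip_network(f"{address}/{source_prefix_len}", strict=False)
    family = 1 if net.version == 4 else 2
    # Keep only the octets needed for the prefix; host bits are already zero.
    addr = net.network_address.packed[: (source_prefix_len + 7) // 8]
    payload = struct.pack("!HBB", family, source_prefix_len, 0) + addr
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


def decode_ecs_option(data):
    """Parse and validate an ECS option; return (network, scope_prefix_len)."""
    code, length = struct.unpack("!HH", data[:4])
    if code != ECS_OPTION_CODE or length != len(data) - 4:
        raise ValueError("not a well-formed ECS option")
    family, source_len, scope_len = struct.unpack("!HBB", data[4:8])
    addr = data[8:]
    full_len = FULL_LENGTH.get(family)
    if full_len is None:
        raise ValueError("unsupported FAMILY")
    if source_len > full_len * 8:
        raise ValueError("SOURCE PREFIX-LENGTH longer than the address")
    if len(addr) != (source_len + 7) // 8:
        raise ValueError("too few or too many ADDRESS octets")
    padded = addr + bytes(full_len - len(addr))
    host_mask = (1 << (full_len * 8 - source_len)) - 1
    if int.from_bytes(padded, "big") & host_mask:
        raise ValueError("non-zero ADDRESS bits beyond SOURCE PREFIX-LENGTH")
    network = ipaddress.ip_network(f"{ipaddress.ip_address(padded)}/{source_len}")
    return network, scope_len


option = encode_ecs_option("1.1.1.1", 24)
print(option.hex())               # 0008000700011800010101
print(decode_ecs_option(option))  # (IPv4Network('1.1.1.0/24'), 0)
```

Reading the hex output left to right: `0008` is the option code, `0007` the payload length, `0001` the IPv4 family, `18` the source prefix length of 24, `00` the scope, and `010101` the three address octets that a /24 needs.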

Query security

Section 7.1.1 says:

In the usual case, where no ECS option was present in the client query, the Recursive Resolver initializes the option by setting FAMILY of the client’s address. It then uses the value of its maximum cacheable prefix length to set SOURCE PREFIX-LENGTH. For privacy reasons, and because the whole IP address is rarely required to determine a tailored response, this length SHOULD be shorter than the full address, as described in Section 11.

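So what is passed along is the client's network prefix, not its complete IP address. In general everyone follows this mechanism. But can an implementation misbehave anyway? The authors defer that to the privacy discussion in Section 11.1: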

With the ECS option, the network address of the client that initiated the resolution becomes visible to all servers involved in the resolution process. Additionally, it will be visible from any network traversed by the DNS packets.
To protect users’ privacy, Recursive Resolvers are strongly encouraged to conceal part of the user’s IP address by truncating IPv4 addresses to 24 bits. 56 bits are recommended for IPv6, based on [RFC6177].
ISPs should have more detailed knowledge of their own networks. That is, they might know that all 24-bit prefixes in a /20 are in the same area. In those cases, for optimal cache utilization and improved privacy, the ISP’s Recursive Resolver SHOULD truncate IP addresses in this /20 to just 20 bits, instead of 24 as recommended above.

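As we can see, every server involved in the resolution, and every network the DNS packets cross, can see this option. To protect user privacy, the RFC recommends truncating IPv4 addresses to /24: if your IPv4 address is 1.1.1.1, the query carries 1.1.1.0/24 rather than the full address. For IPv6 it recommends /56, based on RFC 6177.
For better cache utilization and privacy, an ISP's DNS server should truncate IPv4 addresses to /20 instead of /24 when it knows that every /24 inside that /20 is in the same area. If that assumption does not hold, smaller subnets inside the /20 may get answers meant for a different location.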
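The truncation itself is easy to try out with Python's standard `ipaddress` module. The helper name `ecs_subnet` and its default prefix lengths are assumptions of this sketch; the defaults simply mirror the /24 and /56 recommendations quoted above.

```python
import ipaddress


def ecs_subnet(client_ip, v4_bits=24, v6_bits=56):
    """Return the network a resolver would send instead of the full client IP."""
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of rejecting the address.
    return ipaddress.ip_network(f"{addr}/{bits}", strict=False)


print(ecs_subnet("1.1.1.1"))                  # 1.1.1.0/24
print(ecs_subnet("2001:db8:1234:5678::1"))    # 2001:db8:1234:5600::/56
print(ecs_subnet("10.20.30.40", v4_bits=20))  # 10.20.16.0/20 (the ISP case)
```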

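DNS caching

With the security question settled, on to the other one: what does the local recursive server key its cached answers on? Does it store one answer per domain name, or one per client address? Intuitively it has to be per client address; otherwise whoever queries first decides what gets cached, and if that first answer is wrong for everyone else, everyone after gets the wrong answer too.
The consequence is that the caches on these recursive servers can grow a lot: where one record per name used to be enough, there may now be one per subnet, which I would guess is easily ten times as many.

The RFC handles this in two parts: caching the response (Section 7.3.1, Caching the Response) and answering from the cache (Section 7.3.2). On caching it says: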

In the cache, all resource records in the Answer section MUST be tied to the network specified in the response. The appropriate prefix length depends on the relationship between SOURCE PREFIX-LENGTH, SCOPE PREFIX-LENGTH, and the maximum cacheable prefix length configured for the cache.

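Here we can see that cached results are bound to a network prefix. The authors also define further rules, such as how prefixes of different sizes that contain one another are handled.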
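To make "bound to a network" concrete, here is a toy cache keyed by name and prefix. It is a deliberately simplified sketch: the `EcsCache` class is hypothetical, it ignores TTLs and the maximum cacheable prefix length, and it uses a most-specific-match lookup as one plausible reading of the containment rules rather than a faithful implementation of the RFC.

```python
import ipaddress


class EcsCache:
    """Toy resolver cache: answers are stored per (name, network)."""

    def __init__(self):
        self._entries = {}  # qname -> {network: answers}

    def store(self, qname, address, scope_prefix_len, answers):
        """Tie the answers to the network the response said it covers."""
        net = ipaddress.ip_network(f"{address}/{scope_prefix_len}", strict=False)
        self._entries.setdefault(qname.lower(), {})[net] = answers

    def lookup(self, qname, client_ip):
        """Return answers from the most specific network containing the client."""
        addr = ipaddress.ip_address(client_ip)
        matches = [
            (net, answers)
            for net, answers in self._entries.get(qname.lower(), {}).items()
            if net.version == addr.version and addr in net
        ]
        if not matches:
            return None  # cache miss: the resolver must ask upstream
        return max(matches, key=lambda item: item[0].prefixlen)[1]


cache = EcsCache()
cache.store("www.example.com", "1.1.1.0", 24, ["192.0.2.10"])
cache.store("www.example.com", "2.2.0.0", 16, ["198.51.100.20"])
print(cache.lookup("www.example.com", "1.1.1.77"))  # ['192.0.2.10']
print(cache.lookup("www.example.com", "2.2.9.9"))   # ['198.51.100.20']
print(cache.lookup("www.example.com", "3.3.3.3"))   # None
```

The same name now holds several entries, one per network, which is also why the cache grows compared with a plain name-keyed cache.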

The world has bad actors as well as good ones, so protection at the protocol level is also necessary. Section 11.3 of the RFC addresses part of this.

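Demo

Third-party public DNS servers in China now generally support the ECS option, so we can test it. I am still on a China Mobile network in Beijing.

Query as Chongqing Unicom

Query as Liaoning Telecom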
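Queries like these can be reproduced with a short script. The sketch below builds a plain A query with an ECS option attached, using only the standard library. The function names, the fixed query ID, the 1232-byte advertised UDP size, and the example subnet 203.0.113.0/24 (a documentation range) are all assumptions of this sketch; substitute a real prefix from the network you want to impersonate and the address of a resolver that supports ECS.

```python
import ipaddress
import socket
import struct


def build_ecs_query(qname, subnet, query_id=0x1234):
    """Build a DNS A query for qname carrying an ECS option for subnet."""
    net = ipaddress.ip_network(subnet, strict=False)
    family = 1 if net.version == 4 else 2
    addr = net.network_address.packed[: (net.prefixlen + 7) // 8]
    # ECS option: code 8, length, FAMILY, SOURCE PREFIX-LENGTH, SCOPE 0, ADDRESS
    ecs = struct.pack("!HHHBB", 8, 4 + len(addr), family, net.prefixlen, 0) + addr
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    # OPT pseudo-record: root name, TYPE=41, CLASS=UDP size, TTL=0, RDLENGTH
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, 0, len(ecs)) + ecs
    return header + question + opt


def send_query(packet, resolver_ip, timeout=3.0):
    """Send the packet over UDP port 53 and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (resolver_ip, 53))
        reply, _ = sock.recvfrom(4096)
    return reply


packet = build_ecs_query("www.baidu.com", "203.0.113.0/24")
print(len(packet), packet.hex())
# To actually send it (needs network access and an ECS-capable resolver):
# reply = send_query(packet, "<resolver-ip>")
```

From the command line, `dig` offers the same thing through its `+subnet=` option, which is usually the quicker way to run a one-off check.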