dnsmasq focal 2.80 NODATA instead of NXDOMAIN bug

Bug #1995260 reported by Maximilian Stinsky
This bug affects 1 person
Affects           Status        Importance  Assigned to           Milestone
dnsmasq (Ubuntu)  Fix Released  Undecided   Miriam España Acebal
Focal             Fix Released  Undecided   Miriam España Acebal

Bug Description

[SRU]

[ Impact ]

dnsmasq sometimes incorrectly returns NODATA instead of NXDOMAIN. This can mislead clients that need to determine whether a domain name exists.

[ Test Plan ]

In a Focal VM, install dnsmasq (apt install dnsmasq) if it is not already installed.

#0 Disabling systemd-resolved service and enabling resolution through dnsmasq.

# systemctl disable --now systemd-resolved.service
# rm -f /etc/resolv.conf
# cat > /etc/resolv.conf << __EOF__
nameserver 8.8.8.8
__EOF__
# systemctl start dnsmasq.service

#1 Bad case

# for i in srv txt aaaa a aaaa a txt srv; do host -t $i test.foo. 127.0.0.1 | tail -n 1; done
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
test.foo has no A record
Host test.foo. not found: 3(NXDOMAIN)
test.foo has no A record
test.foo has no TXT record
test.foo has no SRV record
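
For scripted checks, the two reply types in the output above can be told apart by the last line that host prints. The following is a minimal sketch, assuming the host output format shown in this test plan; the helper name classify_host_reply is hypothetical and is not part of dnsmasq or the host utility.

```shell
#!/bin/sh
# classify_host_reply: hypothetical helper for illustration only.
# Takes one line of `host` output (as printed in the test above) and
# prints NXDOMAIN, NODATA, or OTHER.
classify_host_reply() {
    case "$1" in
        *"not found: 3(NXDOMAIN)"*) echo NXDOMAIN ;;
        *" has no "*" record"*)     echo NODATA ;;
        *)                          echo OTHER ;;
    esac
}

# Possible use against the local dnsmasq (requires a running server):
#   for i in srv txt aaaa a; do
#       classify_host_reply "$(host -t $i test.foo. 127.0.0.1 | tail -n 1)"
#   done
```

On an affected system some iterations would be expected to print NODATA; after the fix every iteration should print NXDOMAIN.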

#2 Good case

#2.1 Installing new package

# ls -1 *.deb
dnsmasq-utils_2.80-1.1ubuntu1.6_amd64.deb
dnsmasq-base_2.80-1.1ubuntu1.6_amd64.deb
dnsmasq_2.80-1.1ubuntu1.6_all.deb

# dpkg -i *.deb
(Reading database ... 32073 files and directories currently installed.)
Preparing to unpack dnsmasq-base_2.80-1.1ubuntu1.6_amd64.deb ...
Unpacking dnsmasq-base (2.80-1.1ubuntu1.6) over (2.80-1.1ubuntu1.5) ...
Selecting previously unselected package dnsmasq-utils.
Preparing to unpack dnsmasq-utils_2.80-1.1ubuntu1.6_amd64.deb ...
Unpacking dnsmasq-utils (2.80-1.1ubuntu1.6) ...
Preparing to unpack dnsmasq_2.80-1.1ubuntu1.6_all.deb ...
Unpacking dnsmasq (2.80-1.1ubuntu1.6) over (2.80-1.1ubuntu1.5) ...
Setting up dnsmasq-base (2.80-1.1ubuntu1.6) ...
Setting up dnsmasq-utils (2.80-1.1ubuntu1.6) ...
Setting up dnsmasq (2.80-1.1ubuntu1.6) ...
Processing triggers for dbus (1.12.16-2ubuntu2.3) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for systemd (245.4-4ubuntu3.18) ...

# dpkg -l | grep dnsmasq
ii dnsmasq 2.80-1.1ubuntu1.6 all Small caching DNS proxy and DHCP/TFTP server
ii dnsmasq-base 2.80-1.1ubuntu1.6 amd64 Small caching DNS proxy and DHCP/TFTP server
ii dnsmasq-utils 2.80-1.1ubuntu1.6 amd64 Utilities for manipulating DHCP leases

#2.2 Testing OK

# for i in srv txt aaaa a aaaa a txt srv; do host -t $i test.foo. 127.0.0.1 | tail -n 1; done
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)

[ Where problems could occur ]

This changes the program's behaviour by returning NXDOMAIN in some situations where it used to return NODATA, so any workaround a user had in place for this (a script or other automation) will probably start to malfunction.

The package was last rebuilt for Focal in May, so any dependencies or libraries upgraded in this Ubuntu series since then could affect the new build.

[ Other Info ]

The patch has been applied upstream and originated from a bug filed on the Fedora side: https://bugzilla.redhat.com/show_bug.cgi?id=1674067

[Original Report]
---------------------------------------------------
We upgraded our OpenStack containers, which host dnsmasq services, from bionic to focal. With this we got an update of dnsmasq from 2.79 to 2.80, which introduced a bug in our setup where dnsmasq returns NODATA instead of NXDOMAIN.

This is already fixed upstream with the following commit [1].

The Ubuntu dnsmasq 2.80 package should get a backport with a release for the focal packages which includes this bug fix.

[1] https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=162e5e0062ce923c494cc64282f293f0ed64fc10

Lucas Kanashiro (lucaskanashiro) wrote :

Thanks for taking the time to report this bug and trying to make Ubuntu better. I added a task for Focal, and marked the development release as Fix Released. Could you please provide detailed steps on how to reproduce this issue? We would need that if we decide to try to update Focal with the patch you mentioned.

I am setting the Focal task to Incomplete until you provide information to reproduce the bug; once you do that, please set it back to New.

Changed in dnsmasq (Ubuntu):
status: New → Invalid
status: Invalid → Fix Released
Changed in dnsmasq (Ubuntu Focal):
status: New → Incomplete
Maximilian Stinsky (mstinsky) wrote :

Details on how to reproduce this:
Install dnsmasq on Ubuntu Focal.

Start dnsmasq with, for example: dnsmasq --server 8.8.8.8
Try to resolve hosts that are known not to exist on the authoritative name server that dnsmasq forwards to.

for i in srv txt aaaa a aaaa a txt srv; do host -t $i test.foo. 127.0.0.1 | tail -n 1; done
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
test.foo has no A record
Host test.foo. not found: 3(NXDOMAIN)
test.foo has no A record
test.foo has no TXT record
test.foo has no SRV record

All tests are expected to return NXDOMAIN, but because of the bug they don't.

Same test on an Ubuntu Jammy installation with dnsmasq 2.86:
for i in srv txt aaaa a aaaa a txt srv; do host -t $i test.foo. 127.0.0.1 | tail -n 1; done
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)

The real-world problem where we found this was with the autopath plugin from CoreDNS in Kubernetes. Because dnsmasq 2.80 sometimes returns NODATA, autopath's walk of the search path is interrupted and name resolution fails for records that normally work.
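
The effect on a search-path walk can be illustrated with a toy model. The sketch below is a simplified simulation, not CoreDNS code: it assumes the walker moves on to the next candidate only when it receives NXDOMAIN and treats any other reply, including NODATA, as final. fake_lookup, walk_search_path and the example names are all hypothetical.

```shell
#!/bin/sh
# fake_lookup: hypothetical stand-in for an upstream query.
#   $1 = name being queried
#   $2 = name that wrongly yields NODATA ("" for a correct upstream)
fake_lookup() {
    case "$1" in
        "$2")             echo NODATA ;;   # simulated buggy reply
        www.example.org.) echo ANSWER ;;   # the only name that really exists
        *)                echo NXDOMAIN ;;
    esac
}

# walk_search_path: try each candidate in order, continuing only on NXDOMAIN.
#   $1 = name that wrongly yields NODATA, remaining args = candidates
walk_search_path() {
    buggy_name=$1
    shift
    for candidate in "$@"; do
        reply=$(fake_lookup "$candidate" "$buggy_name")
        if [ "$reply" != NXDOMAIN ]; then
            echo "$candidate $reply"
            return 0
        fi
    done
    echo "- NXDOMAIN"
}

# Correct upstream: the walk reaches the real name.
walk_search_path "" \
    www.example.org.default.svc.cluster.local. \
    www.example.org.svc.cluster.local. \
    www.example.org.

# Buggy upstream: a NODATA for an intermediate candidate ends the walk early.
walk_search_path www.example.org.svc.cluster.local. \
    www.example.org.default.svc.cluster.local. \
    www.example.org.svc.cluster.local. \
    www.example.org.
```

In this model the second call stops at the intermediate candidate with an empty reply, so the real name is never queried.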

Changed in dnsmasq (Ubuntu Focal):
status: Incomplete → New
Changed in dnsmasq (Ubuntu Focal):
status: New → Triaged
tags: added: bitesize server-todo
Athos Ribeiro (athos-ribeiro) wrote :

Thanks for the reproducer, Maximilian.

I added this to the server team backlog so someone can work on this one soon.

Changed in dnsmasq (Ubuntu):
assignee: nobody → Miriam España Acebal (mirespace)
Changed in dnsmasq (Ubuntu Focal):
assignee: nobody → Miriam España Acebal (mirespace)
description: updated
Changed in dnsmasq (Ubuntu Focal):
status: Triaged → In Progress
description: updated
description: updated
description: updated
Andreas Hasenack (ahasenack) wrote : Please test proposed package

Hello Maximilian, or anyone else affected,

Accepted dnsmasq into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dnsmasq/2.80-1.1ubuntu1.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in dnsmasq (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Maximilian Stinsky (mstinsky) wrote :

I tested the patch inside an Ubuntu Focal VM and can verify that the bug is fixed.

Testing that has been done:
Enable proposed repository and install dnsmasq version 2.80-1.1ubuntu1.6:

cat <<EOF >/etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
# Enable Ubuntu proposed archive
deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed restricted main multiverse universe
EOF

apt update

apt install dnsmasq=2.80-1.1ubuntu1.6

dpkg -l | grep dnsmasq
ii dnsmasq 2.80-1.1ubuntu1.6 all Small caching DNS proxy and DHCP/TFTP server
ii dnsmasq-base 2.80-1.1ubuntu1.6 amd64 Small caching DNS proxy and DHCP/TFTP server

Stop systemd-resolved and start dnsmasq:
systemctl stop systemd-resolved.service
dnsmasq --server 8.8.8.8

Run the reproducer mentioned in this bug report:
for i in srv txt aaaa a aaaa a txt srv; do host -t $i test.foo. 127.0.0.1 | tail -n 1; done
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)
Host test.foo. not found: 3(NXDOMAIN)

All queries return NXDOMAIN, which shows the bug is fixed.

Thanks for the help and work Andreas and Miriam!

tags: added: verification-done-focal
removed: verification-needed-focal
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (dnsmasq/2.80-1.1ubuntu1.6)

All autopkgtests for the newly accepted dnsmasq (2.80-1.1ubuntu1.6) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

ubuntu-fan/0.12.13ubuntu0.1 (arm64, s390x, amd64, ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#dnsmasq

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Miriam España Acebal (mirespace) wrote :

That's weird, because it seems that systemd-resolve is not present in the lxd image:

lxd test: Waiting for addresses on eth0 ...
slave: detected primary route through eth0
/bin/sh: 29: systemd-resolve: not found
slave: waiting for systemd resolver...
/bin/sh: 29: systemd-resolve: not found

and the error also happens with the previous dnsmasq version:

Setting up dnsmasq-base (2.80-1.1ubuntu1.5) ...
ERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Setting up ubuntu-fan (0.12.13ubuntu0.1) ...
[...]
FAIL: Error on LXD test

I put +xv to the fanatic script of the ubuntu-fan package to see how the image was created, and I got this:

+ storage_opt=--storage default
+ echo local lxd test: creating test container (Ubuntu:lts) ...
+ lxc launch ubuntu:lts fanatic-test --storage default -p fan-250
Creating fanatic-test
Starting fanatic-test

and creating it manually:

❯ lxc launch ubuntu:lts fanatic-test-lts

Creating fanatic-test-lts
Starting fanatic-test-lts
❯ lxc shell fanatic-test-lts
root@fanatic-test-lts:~#
root@fanatic-test-lts:~#
root@fanatic-test-lts:~#
root@fanatic-test-lts:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy

It's jammy, not focal. And systemd-resolve is not present on Jammy:

root@fanatic-test-lts:~# systemd-
systemd-analyze systemd-escape systemd-run
systemd-ask-password systemd-hwdb systemd-socket-activate
systemd-cat systemd-id128 systemd-stdio-bridge
systemd-cgls systemd-inhibit systemd-sysext
systemd-cgtop systemd-machine-id-setup systemd-sysusers
systemd-cryptenroll systemd-mount systemd-tmpfiles
systemd-delta systemd-notify systemd-tty-ask-password-agent
systemd-detect-virt systemd-path systemd-umount
root@fanatic-test-lts:~# systemd-resolve
systemd-resolve: command not found
root@fanatic-test-lts:~#

The Ubuntu series is hardcoded in the fanatic script at line 1015:

#local series=$( lsb_release -sr )
local series='lts'

So it should be set according to the series to which the package under test belongs (in this case, focal). In addition, for Jammy and later, the use of systemd-resolve in the dns_lookup_forwarder() function at line 708 should be changed too.
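
One way the dns_lookup_forwarder() change could be approached is to probe for whichever resolver command is installed. This is only a sketch, not the fix that went into ubuntu-fan: the helper name pick_resolver_cli is hypothetical, and it assumes resolvectl is the command to use on releases that no longer ship systemd-resolve.

```shell
#!/bin/sh
# pick_resolver_cli: hypothetical helper, not taken from the fanatic script.
# Prints the first resolver command found on PATH, preferring resolvectl
# (assumed to be available on newer releases) over systemd-resolve.
# Returns non-zero and prints nothing if neither is installed.
pick_resolver_cli() {
    for tool in resolvectl systemd-resolve; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Possible use:
#   if cli=$(pick_resolver_cli); then
#       echo "using $cli"
#   else
#       echo "no systemd resolver CLI found" >&2
#   fi
```

A caller would still need to adapt the arguments, since the two commands do not share an identical command-line syntax.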

Miriam España Acebal (mirespace) wrote :

I swapped the two lines shown above that set the local series variable, and now the test passes:

autopkgtest [17:48:12]: test lxd: [-----------------------
II: Auto-init LXD...
qemu-system-x86_64: Slirp: external icmpv6 not supported yet
II: Creating Fan Bridge...
configuring fan underlay:10.0.0.0/16 overlay:250.0.0.0/8
II: Create LXD profile for Fan Bridge...
configuring LXD for underlay:10.0.0.0/16 overlay:250.0.0.0/8 (fan-250)
Profile fan-250 created
II: Test LXD...
master: detected primary route through ens3
master: DNS: systemd(10.0.2.3)
local lxd test: creating test container (Ubuntu:20.04) ...
Creating fanatic-test
Starting fanatic-test
lxd test: Waiting for addresses on eth0 ...
lxd test: Waiting for addresses on eth0 ...
lxd test: Waiting for addresses on eth0 ...
lxd test: Waiting for addresses on eth0 ...
slave: detected primary route through eth0
slave: waiting for systemd resolver...
sd_bus_open_system: No such file or directory
slave: DNS: systemd(250.2.15.1)
test master: ping test (250.2.15.202) ...
test slave: ping test (250.2.15.1) ...
test master: ping test ... PASS
test master: short data test (250.2.15.1 -> 250.2.15.202) ...
test slave: ping test ... PASS
test slave: short data test (250.2.15.202 -> 250.2.15.1) ...
test master: short data ... PASS
test master: long data test (250.2.15.1 -> 250.2.15.202) ...
test slave: short data ... PASS
test slave: long data test (250.2.15.202 -> 250.2.15.1) ...
test slave: long data ... PASS
test master: long data ... PASS
local lxd test: destroying test container ...
local lxd test: test complete PASS (master=0 slave=0)
II: Undefining LXD profile for Fan Bridge...
de-configuring LXD underlay:10.0.0.0/16 overlay:250.0.0.0/8
Profile fan-250 deleted
fan-250: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
        inet 250.2.15.1 netmask 255.0.0.0 broadcast 0.0.0.0
        inet6 fe80::e493:42ff:fed3:34ec prefixlen 64 scopeid 0x20<link>
        ether e6:93:42:d3:34:ec txqueuelen 1000 (Ethernet)
        RX packets 5684 bytes 166505628 (166.5 MB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5922 bytes 167845841 (167.8 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

II: Removing Fan Bridge...
de-configuring fan underlay:10.0.0.0/16 overlay:250.0.0.0/8
PASS
autopkgtest [17:49:19]: test lxd: -----------------------]
autopkgtest [17:49:20]: test lxd: - - - - - - - - - - results - - - - - - - - - -
lxd PASS
autopkgtest [17:49:21]: @@@@@@@@@@@@@@@@@@@@ summary
command1 PASS
lxd PASS

Christian Ehrhardt (paelzer) wrote :

Great find, Miriam.
I've looked into it and fully agree.
Since I had all the data at that moment, I filed bug 1998184 for ubuntu-fan.
Based on that, we need to mask the tests, and we can ignore them here with regard to this SRU (until fixed in ubuntu-fan).

Christian Ehrhardt (paelzer) wrote :

@SRU team - please consider accepting and merging the test hint [1] to resolve the current blocker for this SRU.

[1]: https://code.launchpad.net/~paelzer/britney/+git/hints-ubuntu/+merge/433770

Christian Ehrhardt (paelzer) wrote :

FYI: the proposed-migration tests should be happy as soon as the migration-reference run for ubuntu-fan has completed (but the queues are long).

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dnsmasq - 2.80-1.1ubuntu1.6

---------------
dnsmasq (2.80-1.1ubuntu1.6) focal; urgency=medium

  * src/cache.c: Apply 162e5e0062ce923c494cc64282f293f0ed64fc10 from
    upstream GIT to fix bug in DNS non-terminal code, added in 2.80,
    which could sometimes cause a NODATA rather than an NXDOMAIN
    reply (LP: #1995260).

 -- Miriam España Acebal <email address hidden> Tue, 15 Nov 2022 10:35:15 +0100

Changed in dnsmasq (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for dnsmasq has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
