Error: failed to begin transaction: database is locked
Kees Bakker
2018-09-12 14:33:23 UTC
Hey,

This is with LXD/LXC on an Ubuntu 18.04 server. Storage is done
with LVM. It was installed as a cluster with just one node.
It was also added as a remote for three other LXD servers (all Ubuntu 16.04
and LXD 2.0.x). These old servers have BTRFS storage.

Suddenly I cannot run any lxc command anymore. They all give

    Error: failed to begin transaction: database is locked

In /var/log/lxd/lxd.log it prints the following message every 10 seconds

    lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T16:28:44+0200

Extra information: this afternoon I upgraded one of the "old" servers
to LXD 3.0 (from xenial-backports). This was triggered by the problems we
have with a container in ERROR state and a kworker at 100% CPU load.
--
Kees
Fajar A. Nugraha
2018-09-13 03:57:06 UTC
Post by Kees Bakker
Hey,
This is with LXD/LXC on an Ubuntu 18.04 server. Storage is done
with LVM. It was installed as a cluster with just one node.
It was also added as remote for three other LXD servers (all Ubuntu 16.04
and LXD 2.0.x). These old servers have BTRFS storage.
Only added as a remote? Not LXD clustering (https://lxd.readthedocs.io/en/latest/clustering/)?
Post by Kees Bakker
Suddenly I cannot do any lxc command anymore. They all give
Error: failed to begin transaction: database is locked
In /var/log/lxd/lxd.log it prints the following message every 10 seconds
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft
server address: failed to begin transaction: database is locked"
t=2018-09-12T16:28:44+0200
Extra information. This afternoon I have upgraded one of the "old" servers
to LXD 3.0 (from xenial-backports). This was triggered by the problems we
have with a container in ERROR state and a kworker at 100% cpu load.
Do the package versions on the upgraded server match? I.e. are lxd, liblxc1, etc.
all 3.0 from xenial-backports, without any 2.x or PPA packages mixed in?
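
One way to check this is to compare the major versions that dpkg reports for the LXD-related packages. The sketch below is only an illustration: the package list (lxd, lxd-client, liblxc1) is an assumption about what is installed, the helper names are made up, and the version strings in the sample are invented for the example, not taken from the servers in this thread.

```python
import subprocess


def parse_versions(dpkg_output):
    """Turn 'package version' lines into a {package: version} dict."""
    versions = {}
    for line in dpkg_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # packages that are not installed have no version
            versions[parts[0]] = parts[1]
    return versions


def mixed_majors(versions):
    """Return {package: major} if the major versions differ, else {}."""
    majors = {}
    for pkg, ver in versions.items():
        upstream = ver.split(":", 1)[-1]  # drop a Debian epoch such as "1:"
        majors[pkg] = upstream.split(".", 1)[0]
    return majors if len(set(majors.values())) > 1 else {}


def query_installed(packages=("lxd", "lxd-client", "liblxc1")):
    """Ask dpkg-query for the installed versions (package list is an assumption)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package} ${Version}\\n", *packages],
        capture_output=True, text=True, check=False,
    )
    return parse_versions(result.stdout)


# Invented sample output: a 2.x liblxc1 left behind next to a 3.0 lxd.
SAMPLE_MIXED = (
    "lxd 3.0.1-0ubuntu1~16.04.4\n"
    "lxd-client 3.0.1-0ubuntu1~16.04.4\n"
    "liblxc1 2.0.8-0ubuntu1~16.04.2\n"
)

print(mixed_majors(parse_versions(SAMPLE_MIXED)))
```

On the real server you would call mixed_majors(query_installed()) instead; an empty dict means all listed packages share one major version.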

Have you restarted lxd on the upgraded server?

If you temporarily move ~/.config/lxc somewhere else (to "remove" all the
remotes, among other things), does the lxc command work?
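
If you want the move to be undone automatically, something along these lines could work. It is a rough sketch, not a tested procedure: config_moved_aside is a made-up helper, the ".bak" suffix is arbitrary, and it assumes that anything the client recreates in the meantime can be thrown away on restore. The demo runs against a temporary directory, with config.yml as a stand-in file, so it never touches a real ~/.config/lxc.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


@contextlib.contextmanager
def config_moved_aside(config_dir):
    """Temporarily rename config_dir to <name>.bak, then put it back."""
    config_dir = Path(config_dir).expanduser()
    backup = config_dir.with_name(config_dir.name + ".bak")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists, refusing to overwrite")
    moved = config_dir.exists()
    if moved:
        shutil.move(str(config_dir), str(backup))
    try:
        yield backup if moved else None
    finally:
        if moved:
            # Discard whatever was recreated while the config was away
            # (assumption: it is disposable), then restore the original.
            if config_dir.exists():
                shutil.rmtree(config_dir)
            shutil.move(str(backup), str(config_dir))


# Demo on a throwaway directory instead of the real client config.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "lxc"
    cfg.mkdir()
    (cfg / "config.yml").write_text("stand-in content\n")
    with config_moved_aside(cfg):
        # On a real system this is where you would try e.g. `lxc list`.
        print("config present during test:", cfg.exists())
    print("config restored:", (cfg / "config.yml").exists())
```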
--
Fajar
Kees Bakker
2018-09-13 07:12:27 UTC
Post by Kees Bakker
Hey,
This is with LXD/LXC on an Ubuntu 18.04 server. Storage is done
with LVM. It was installed as a cluster with just one node.
It was also added as remote for three other LXD servers (all Ubuntu 16.04
and LXD 2.0.x). These old servers have BTRFS storage.
Post by Fajar A. Nugraha
Only added as a remote? Not LXD clustering (https://lxd.readthedocs.io/en/latest/clustering/)?
Yes, only as remote.
Post by Kees Bakker
Suddenly I cannot do any lxc command anymore. They all give
    Error: failed to begin transaction: database is locked
In /var/log/lxd/lxd.log it prints the following message every 10 seconds
    lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T16:28:44+0200
Extra information. This afternoon I have upgraded one of the "old" servers
to LXD 3.0 (from xenial-backports). This was triggered by the problems we
have with a container in ERROR state and a kworker at 100% cpu load.
Post by Fajar A. Nugraha
Do the package versions on the upgraded server match? I.e. are lxd, liblxc1, etc. all 3.0 from xenial-backports, without any 2.x or PPA packages mixed in?
Have you restarted lxd on the upgraded server?
Not manually, no. The upgrade wasn't totally smooth. It ran into a timeout while
setting up some lxc network config.

Then I did a reboot, and the shutdown hung on something related to ebtables (a new
package because of the move to 3.0). I forced a power-down and luckily the server
came up normally.

After that I noticed the problem described above. Restarting the LXD server
solved it, and it is back to normal. (I didn't know for sure that the LXD server
could be restarted without killing the containers, but it worked.)

Here are a few lines from lxd.log from around the time the problem started.

lvl=info msg="Raft: Snapshot to 597621 complete" t=2018-09-12T15:47:01+0200
lvl=info msg="Raft: Starting snapshot up to 597696" t=2018-09-12T15:52:15+0200
lvl=info msg="Raft: Compacting logs from 597494 to 597568" t=2018-09-12T15:52:16+0200
lvl=info msg="Raft: Snapshot to 597696 complete" t=2018-09-12T15:52:16+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:56:55+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:04+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:13+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:22+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:31+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:40+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:49+0200
lvl=info msg="Raft: Starting snapshot up to 597760" t=2018-09-12T15:57:52+0200
lvl=warn msg="Raft: Unable to get address for server id 1, using fallback address 0: failed to begin transaction: database is locked" t=2018-09-12T15:57:57+0200
lvl=info msg="Raft: Compacting logs from 597569 to 597632" t=2018-09-12T15:57:57+0200
lvl=info msg="Raft: Snapshot to 597760 complete" t=2018-09-12T15:57:57+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:58+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:58:07+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:58:16+0200
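
To find when the warnings began and how often they repeat without reading the log by eye, a small parser is enough. This is a minimal sketch that assumes every line has the lvl=... msg="..." t=... shape seen above; parse_line and summarize_locked are names made up here, and the sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# Matches lines shaped like the lxd.log excerpts in this thread:
#   lvl=<level> msg="<text>" t=<timestamp>
LINE_RE = re.compile(r'^lvl=(?P<lvl>\w+) msg="(?P<msg>.*)" t=(?P<t>\S+)$')


def parse_line(line):
    """Return (level, message, timestamp), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("t"), "%Y-%m-%dT%H:%M:%S%z")
    return m.group("lvl"), m.group("msg"), ts


def summarize_locked(lines):
    """Count 'database is locked' warnings and the seconds between them."""
    stamps = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        lvl, msg, ts = parsed
        if lvl == "warn" and "database is locked" in msg:
            stamps.append(ts)
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return {"count": len(stamps), "first": stamps[0] if stamps else None, "gaps": gaps}


LOCKED = ("Failed to get current raft nodes: failed to fetch raft server address: "
          "failed to begin transaction: database is locked")
SAMPLE = [
    'lvl=info msg="Raft: Snapshot to 597696 complete" t=2018-09-12T15:52:16+0200',
    f'lvl=warn msg="{LOCKED}" t=2018-09-12T15:56:55+0200',
    f'lvl=warn msg="{LOCKED}" t=2018-09-12T15:57:04+0200',
    f'lvl=warn msg="{LOCKED}" t=2018-09-12T15:57:13+0200',
]

summary = summarize_locked(SAMPLE)
print(summary["count"], summary["first"].isoformat(), summary["gaps"])
```

For the full file, pass the lines of /var/log/lxd/lxd.log instead of SAMPLE. In the excerpt above the first warning is at 15:56:55 and most of the following ones are 9 seconds apart.
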
Post by Fajar A. Nugraha
If you temporarily move ~/.config/lxc somewhere else (to "remove" all the remotes, among other things), does the lxc command work?
I'll remember that for next time. Right now the server is working again.

Thanks
