Discussion:
Trying out migration, getting "error: checkpoint failed"
Giles Thomas
2015-05-01 17:03:59 UTC
Hi there,

I've been taking a look at LXD and the migration feature sounds really
cool, but unfortunately I can't get it to work. I'm following Tycho
Andersen's blog post at
<http://tycho.ws/blog/2015/04/lxd-live-migration.html>, and when I run

lxc move lxd:migratee lxd2:migratee

I get the error

error: checkpoint failed

OS is Ubuntu Trusty 64-bit, running on Amazon EC2:

# uname -a
Linux ip-10-139-6-38 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13
19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty

There's no migration log in /var/log/lxd/migratee/, just an empty
lxc.log. Nothing being written to syslog.

Any advice on tracking down what's going wrong would be much appreciated!


All the best,

Giles
--
Giles Thomas <***@pythonanywhere.com>

PythonAnywhere: Develop and host Python from your browser
<https://www.pythonanywhere.com/>

A product from PythonAnywhere LLP
17a Clerkenwell Road, London EC1M 5RD, UK
VAT No.: GB 893 5643 79
Registered in England and Wales as company number OC378414.
Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
Tycho Andersen
2015-05-04 13:15:20 UTC
Hi Giles,
Post by Giles Thomas
Hi there,
I've been taking a look at LXD and the migration feature sounds really cool,
but unfortunately I can't get it to work. I'm following Tycho Andersen's
blog post at <http://tycho.ws/blog/2015/04/lxd-live-migration.html>, and
when I run
lxc move lxd:migratee lxd2:migratee
I get the error
error: checkpoint failed
# uname -a
Linux ip-10-139-6-38 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13
19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty
There's no migration log in /var/log/lxd/migratee/, just an empty lxc.log.
Nothing being written to syslog.
Any advice on tracking down what's going wrong would be much appreciated!
What versions of criu and liblxc do you have installed? Can you look
in /var/lib/lxd/lxc/<container>/lxc.log (or
/var/log/lxd/<container>/lxc.log; there was a bug fixed recently that
moved it to the latter location).
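
For readers following along, the two checks asked for here could be sketched roughly as below. This is only an illustration: `report_lxc_logs` and its optional root-prefix argument are hypothetical helpers, not part of LXD. The two paths are the ones named in this message, and the package names `criu` and `liblxc1` are assumed from the versions reported later in the thread.

```shell
# Report, for one container, whether each lxc.log location mentioned above
# exists and how large it is. The second argument is a hypothetical root
# prefix (default: empty) so the function can be tried on a scratch tree.
report_lxc_logs() {
    container=$1
    root=${2:-}
    for path in \
        "$root/var/lib/lxd/lxc/$container/lxc.log" \
        "$root/var/log/lxd/$container/lxc.log"
    do
        if [ -f "$path" ]; then
            size=$(wc -c < "$path" | tr -d ' ')
            printf '%s %s bytes\n' "$path" "$size"
        else
            printf '%s missing\n' "$path"
        fi
    done
}

# Typical use on the source host (package names are an assumption):
#   dpkg-query -W criu liblxc1
#   report_lxc_logs migratee
```

A log that exists but reports 0 bytes is worth distinguishing from one that is missing, since the two point at different problems.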

Thanks,

Tycho
_______________________________________________
lxc-users mailing list
http://lists.linuxcontainers.org/listinfo/lxc-users
Giles Thomas
2015-05-05 13:18:23 UTC
Hi Tycho,

Thanks for the reply!
Post by Tycho Andersen
What versions of criu and liblxc do you have installed? Can you look
in /var/lib/lxd/lxc/<container>/lxc.log (or
/var/log/lxd/<container>/lxc.log; there was a bug fixed recently that
moved it to the latter location).
Heh, I hadn't realised that criu wasn't automatically installed as a
dependency for lxd (might be worth adding to the blog post?), so I
didn't have it installed at all.

However, installing it doesn't fix the problem. I currently have criu
1.4-1~ubuntu14.04.1~ppa1 and liblxc1
1.1.2+master~20150428-0938-0ubuntu1~trusty. The former was installed
just with an apt-get install criu, and the latter via following your
blog post pretty much robotically...

/var/log/lxd/migratee/lxc.log is present, but empty.


All the best,

Giles
Tycho Andersen
2015-05-05 13:43:05 UTC
Post by Giles Thomas
Hi Tycho,
Thanks for the reply!
What versions of criu and liblxc do you have installed? Can you look in
/var/lib/lxd/lxc/<container>/lxc.log (or /var/log/lxd/<container>/lxc.log;
there was a bug fixed recently that moved it to the latter location).
Heh, I hadn't realised that criu wasn't automatically installed as a
dependency for lxd (might be worth adding to the blog post?), so I didn't
have it installed at all.
Ah yeah. It seems I forgot to mention anything about criu at all :)
Post by Giles Thomas
However, installing it doesn't fix the problem. I currently have criu
1.4-1~ubuntu14.04.1~ppa1 and liblxc1
1.1.2+master~20150428-0938-0ubuntu1~trusty. The former was installed just
with an apt-get install criu, and the latter via following your blog post
pretty much robotically...
/var/log/lxd/migratee/lxc.log is present, but empty.
Does /var/lib/lxd/lxc/migratee/lxc.log exist?

You might try building criu from their git, I'm not sure the version
in our PPA is new enough to actually work with liblxc 1.1.2.

Tycho
Giles Thomas
2015-05-05 15:26:50 UTC
Hi Tycho,
Post by Tycho Andersen
Post by Giles Thomas
What versions of criu and liblxc do you have installed? Can you look in
/var/lib/lxd/lxc/<container>/lxc.log (or /var/log/lxd/<container>/lxc.log;
there was a bug fixed recently that moved it to the latter location).
Heh, I hadn't realised that criu wasn't automatically installed as a
dependency for lxd (might be worth adding to the blog post?), so I didn't
have it installed at all.
Ah yeah. It seems I forgot to mention anything about criu at all :)
Post by Giles Thomas
However, installing it doesn't fix the problem. I currently have criu
1.4-1~ubuntu14.04.1~ppa1 and liblxc1
1.1.2+master~20150428-0938-0ubuntu1~trusty. The former was installed just
with an apt-get install criu, and the latter via following your blog post
pretty much robotically...
/var/log/lxd/migratee/lxc.log is present, but empty.
Does /var/lib/lxd/lxc/migratee/lxc.log exist?
Aha! It doesn't, but there is a file /var/lib/lxd/lxc/migratee/log,
which I missed last time around. I just tried the move again, with a
tail -f on that file running in the background, and got some output:

***@XXXXXXX:/var/log/lxd/migratee# lxc move lxd:migratee lxd2:migratee
lxc 1430839338.008 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.010 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.010 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1430839338.010 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.012 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.014 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.014 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1430839338.016 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.019 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.020 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.020 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1430839338.022 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.025 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.027 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.027 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1430839338.028 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.887 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.889 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839338.889 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1430839340.029 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839340.031 DEBUG lxc_commands - commands.c:lxc_cmd_handler:888 - peer has disconnected
lxc 1430839340.031 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
error: checkpoint failed
***@XXXXXXX:/var/log/lxd/migratee#

Anything useful there? Or anything that points at any other logs I
should be checking?
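
Since nearly every line in that output is DEBUG-level chatter, a small filter can make it easier to see whether anything more serious was logged at all. The sketch below is an assumption-laden illustration, not an LXD or LXC tool: `lxc_log_non_debug` is a hypothetical name, and it assumes each entry in the real file is a single line of the form `lxc <timestamp> <LEVEL> ...` (the two-line entries above are only email wrapping).

```shell
# Print only the log entries whose level (third whitespace-separated field)
# is not DEBUG. Assumes one entry per line: "lxc <timestamp> <LEVEL> ...".
lxc_log_non_debug() {
    awk 'NF >= 3 && $3 != "DEBUG"'
}

# Example, watching the file found above while the move runs in another shell:
#   tail -f /var/lib/lxd/lxc/migratee/log | lxc_log_non_debug
```

If the filter prints nothing during a failed move, as would be the case for the output above, that suggests the failure is not being logged to this file at all.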
Post by Tycho Andersen
You might try building criu from their git, I'm not sure the version
in our PPA is new enough to actually work with liblxc 1.1.2.
Sure, I can definitely give that a go -- I'll wait for half an hour or
so, though, just in case you see the log above and immediately realise
what stupid thing I've done wrong ;-)


All the best,

Giles
Tycho Andersen
2015-05-05 15:50:38 UTC
Post by Giles Thomas
Hi Tycho,
Post by Tycho Andersen
Post by Giles Thomas
What versions of criu and liblxc do you have installed? Can you look in
/var/lib/lxd/lxc/<container>/lxc.log (or /var/log/lxd/<container>/lxc.log;
there was a bug fixed recently that moved it to the latter location).
Heh, I hadn't realised that criu wasn't automatically installed as a
dependency for lxd (might be worth adding to the blog post?), so I didn't
have it installed at all.
Ah yeah. It seems I forgot to mention anything about criu at all :)
Post by Giles Thomas
However, installing it doesn't fix the problem. I currently have criu
1.4-1~ubuntu14.04.1~ppa1 and liblxc1
1.1.2+master~20150428-0938-0ubuntu1~trusty. The former was installed just
with an apt-get install criu, and the latter via following your blog post
pretty much robotically...
/var/log/lxd/migratee/lxc.log is present, but empty.
Does /var/lib/lxd/lxc/migratee/lxc.log exist?
Aha! It doesn't, but there is a file /var/lib/lxd/lxc/migratee/log, which I
missed last time around. I just tried the move again, with a tail -f on
that file running in the background, and got some output:
[... log as in the previous message ...]
error: checkpoint failed
Anything useful there? Or anything that points at any other logs I should
be checking?
Can you check the lxd stderr by chance (probably lives in /var/log
somewhere depending on what init system you're using)? I suspect that
liblxc is rejecting dumping the container in its internal predump
checks, but the above log doesn't say why, unfortunately.

Sorry for all the confusion, the logging stuff here is still a bit of
a mess, although a bit better on the current lxd master.

Tycho
Post by Giles Thomas
Post by Tycho Andersen
You might try building criu from their git, I'm not sure the version
in our PPA is new enough to actually work with liblxc 1.1.2.
Sure, I can definitely give that a go -- I'll wait for half an hour or so,
though, just in case you see the log above and immediately realise what
stupid thing I've done wrong ;-)
Giles Thomas
2015-05-05 16:10:56 UTC
Hi Tycho,
Post by Tycho Andersen
Can you check the lxd stderr by chance (probably lives in /var/log
somewhere depending on what init system you're using)? I suspect that
liblxc is rejecting dumping the container in its internal predump
checks, but the above log doesn't say why unfortunately. Sorry for all
the confusion, the logging stuff here is still a bit of a mess,
although a bit better on the current lxd master.
Oddly, there didn't appear to be one; "find /var/log -name \*lxd\*" just
found "/var/log/lxd". Nothing relevant-looking in "/var/log/upstart/"
apart from "lxc-net.log", which has an "Address already in use" error:

dnsmasq: failed to create listening socket for 10.0.3.1: Address
already in use
Failed to setup lxc-net.

Doubly-oddly, there's a "/etc/init/lxd.conf" *and* a "/etc/init.d/lxd",
which confuses me a little. Does that not mean that both init and
upstart will try to start it? (My knowledge of the workings of init
systems is not as in-depth as I would like.) Should I remove one of them,
then change the remaining one to write stdout/err somewhere sensible?

I can also see that there are still init and upstart scripts for lxcfs,
which is a bit messy -- the "apt-get remove lxcfs" should presumably
have deleted them -- but they depend on "/usr/bin/lxcfs", which
definitely doesn't exist, so I guess that's not the problem.


All the best,

Giles
Tycho Andersen
2015-05-06 14:40:52 UTC
Hi Giles,
Post by Giles Thomas
Hi Tycho,
Post by Tycho Andersen
Can you check the lxd stderr by chance (probably lives in /var/log
somewhere depending on what init system you're using)? I suspect that
liblxc is rejecting dumping the container in its internal predump checks,
but the above log doesn't say why unfortunately. Sorry for all the
confusion, the logging stuff here is still a bit of a mess, although a bit
better on the current lxd master.
Oddly, there didn't appear to be one; "find /var/log -name \*lxd\*" just
found "/var/log/lxd". Nothing relevant-looking in "/var/log/upstart/" apart
from "lxc-net.log", which has an "Address already in use" error:
dnsmasq: failed to create listening socket for 10.0.3.1: Address already
in use
Failed to setup lxc-net.
Doubly-oddly, there's a "/etc/init/lxd.conf" *and* a "/etc/init.d/lxd",
which confuses me a little. Does that not mean that both init and upstart
will try to start it? (My knowledge of the workings of init systems is not
as in-depth as I would like.) Should I remove one of them then change the
remaining one to write stdout/err somewhere sensible?
You could, but it may be easier to just stop the lxd service and run
it manually so that it writes stderr to the terminal you're using.
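
A minimal sketch of that suggestion follows. The helper name `run_in_foreground` and the log path `/tmp/lxd-stderr.log` are hypothetical, and the `--debug` and `--group lxd` flags in the usage comment are assumptions about how the daemon is normally started, so check the existing init job before copying them.

```shell
# Run a command in the foreground, showing its stdout and stderr on the
# terminal and appending a copy of both to LOGFILE.
# Usage: run_in_foreground LOGFILE COMMAND [ARGS...]
run_in_foreground() {
    logfile=$1
    shift
    "$@" 2>&1 | tee -a "$logfile"
}

# Possible use on the Trusty host from this thread (as root; flags assumed):
#   service lxd stop
#   run_in_foreground /tmp/lxd-stderr.log lxd --debug --group lxd
```

Keeping a copy in a file means the output survives after the terminal scrolls, which helps when the failing move is run from a second shell.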

Looking at the code path, it looks like there are a few (really
unlikely) ways it could fail without writing anything to the log (such
as OOM or not being able to make a temporary directory, but it's root
so as long as you have enough disk/ram it /should/ die with some error
message). If you can't find anything, it may be worth building a
liblxc from source and trying to debug things that way.
Post by Giles Thomas
I can also see that there are still init and upstart scripts for lxcfs,
which is a bit messy -- the "apt-get remove lxcfs" should presumably have
deleted them -- but they depend on "/usr/bin/lxcfs", which definitely
doesn't exist, so I guess that's not the problem.
`remove` doesn't always remove config files, `purge` is supposed to
though.
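
To illustrate the difference: packages that were removed but still have configuration files installed show up in `dpkg -l` with the status `rc`. The sketch below lists them; `residual_config_packages` is a hypothetical helper name, and whether lxcfs actually appears in that state on this host is an assumption.

```shell
# Read `dpkg -l` output on stdin and print the names of packages in state
# "rc": removed, but with configuration files still installed.
residual_config_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Example:
#   dpkg -l | residual_config_packages    # lists leftovers, if any
#   apt-get purge lxcfs                   # as root: also removes its config files
```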

Tycho
Tycho Andersen
2015-05-06 15:29:48 UTC
Post by Tycho Andersen
Hi Giles,
Post by Giles Thomas
Hi Tycho,
Post by Tycho Andersen
Can you check the lxd stderr by chance (probably lives in /var/log
somewhere depending on what init system you're using)? I suspect that
liblxc is rejecting dumping the container in its internal predump checks,
but the above log doesn't say why unfortunately. Sorry for all the
confusion, the logging stuff here is still a bit of a mess, although a bit
better on the current lxd master.
Oddly, there didn't appear to be one; "find /var/log -name \*lxd\*" just
found "/var/log/lxd". Nothing relevant-looking in "/var/log/upstart/" apart
from "lxc-net.log", which has an "Address already in use" error:
dnsmasq: failed to create listening socket for 10.0.3.1: Address already
in use
Failed to setup lxc-net.
Doubly-oddly, there's a "/etc/init/lxd.conf" *and* a "/etc/init.d/lxd",
which confuses me a little. Does that not mean that both init and upstart
will try to start it? (My knowledge of the workings of init systems is not
as in-depth as I would like.) Should I remove one of them then change the
remaining one to write stdout/err somewhere sensible?
You could, but it may be easier to just stop the lxd service and run
it manually so that it writes stderr to the terminal you're using.
Looking at the code path, it looks like there are a few (really
unlikely) ways it could fail without writing anything to the log (such
as OOM or not being able to make a temporary directory, but it's root
so as long as you have enough disk/ram it /should/ die with some error
message). If you can't find anything, it may be worth building a
liblxc from source and trying to debug things that way.
Sorry, I did just find one notable exception with the current git
master: liblxc doesn't complain when exec'ing criu fails. Do you have
criu installed in a place where liblxc can find it?

I posted a patch to fix this particular case, but it seems likely
that's where your problem is.

Tycho
Post by Tycho Andersen
Post by Giles Thomas
I can also see that there are still init and upstart scripts for lxcfs,
which is a bit messy -- the "apt-get remove lxcfs" should presumably have
deleted them -- but they depend on "/usr/bin/lxcfs", which definitely
doesn't exist, so I guess that's not the problem.
`remove` doesn't always remove config files, `purge` is supposed to
though.
Tycho
Giles Thomas
2015-05-07 11:32:58 UTC
Hi Tycho,
Post by Tycho Andersen
Sorry, I did just find one notable exception with the current git
master: liblxc doesn't complain when exec'ing criu fails. Do you have
criu installed in a place where liblxc can find it? I posted a patch
to fix this particular case, but it seems likely that's where your
problem is.
It's installed in /usr/sbin/criu -- the lxc monitor is running as root,
so that should be OK, right?


All the best,

Giles
Tycho Andersen
2015-05-08 16:03:24 UTC
Hi Giles,

Sorry for the delay.
Post by Giles Thomas
Hi Tycho,
liblxc doesn't complain when exec'ing criu fails. Do you have criu installed
in a place where liblxc can find it? I posted a patch to fix this
particular case, but it seems likely that's where your problem is.
It's installed in /usr/sbin/criu -- the lxc monitor is running as root, so
that should be OK, right?
I think so, but obviously something is wrong :). If you cat
/proc/`pidof lxd`/environ, is /usr/sbin in its path? It may be worth
upgrading to the lxd/lxd-client from git master; I wrote a patch a few
days ago so you can do:

lxc info migratee --show-log

and get the lxc log output, which should have the error you're
experiencing.
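
The PATH check described above can be sketched as follows. Both function names are hypothetical helpers written for this illustration. The only facts relied on are that `/proc/<pid>/environ` separates entries with NUL bytes and that `PATH` is a colon-separated list. The sketch also assumes `pidof lxd` returns a single PID.

```shell
# Print the PATH value from a /proc/<pid>/environ-style file, whose entries
# are separated by NUL bytes rather than newlines.
path_from_environ() {
    tr '\000' '\n' < "$1" | sed -n 's/^PATH=//p'
}

# Print the first executable called NAME found in the colon-separated
# directory list DIRS; return non-zero if none is found.
# Usage: find_in_path NAME DIRS
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example (as root):
#   find_in_path criu "$(path_from_environ "/proc/$(pidof lxd)/environ")"
```

This answers a slightly sharper question than eyeballing the environment: whether a `criu` executable is actually reachable through the daemon's own PATH.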

Tycho
Giles Thomas
2015-05-08 17:04:01 UTC
Hi Tycho,
Post by Tycho Andersen
Post by Giles Thomas
Hi Tycho,
liblxc doesn't complain when exec'ing criu fails. Do you have criu installed
in a place where liblxc can find it? I posted a patch to fix this
particular case, but it seems likely that's where your problem is.
It's installed in /usr/sbin/criu -- the lxc monitor is running as root, so
that should be OK, right?
I think so, but obviously something is wrong :). If you cat
/proc/`pidof lxd`/environ, is /usr/sbin in its path? It may be worth
upgrading to the lxd/lxd-client from git master; I wrote a patch a few
days ago so you can do:
lxc info migratee --show-log
and get the lxc log output, which should have the error you're
experiencing.
Thanks for the reply! I'm out of the office at the moment, but will
check this out when I'm back in next Thursday.


All the best,

Giles
Giles Thomas
2015-05-15 17:30:05 UTC
Hi Tycho,
Post by Tycho Andersen
Sorry for the delay.
Likewise!
Post by Tycho Andersen
Post by Giles Thomas
liblxc doesn't complain when exec'ing criu fails. Do you have criu installed
in a place where liblxc can find it? I posted a patch to fix this
particular case, but it seems likely that's where your problem is.
It's installed in /usr/sbin/criu -- the lxc monitor is running as root, so
that should be OK, right?
I think so, but obviously something is wrong :). If you cat
/proc/`pidof lxd`/environ, is /usr/sbin in its path?
Yes, it is.
Post by Tycho Andersen
It may be worth
upgrading to the lxd/lxd-client from git master; I wrote a patch a few
days ago so you can do:
lxc info migratee --show-log
and get the lxc log output, which should have the error you're
experiencing.
I built lxd from source and re-ran the test; here's what I got from the
--show-log:


***@XXXXXXX:~# $GOPATH/bin/lxc info migratee --show-log
Name: migratee
Status: RUNNING
Init: 1647
Ips:
(none)

Log:

lxc 1431710609.087 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710609.091 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710609.097 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710609.102 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710802.699 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710802.703 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710802.709 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710802.715 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710803.578 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710804.702 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710884.178 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710884.181 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710884.188 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710884.194 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710885.058 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710886.166 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710895.481 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710895.485 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710895.492 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state
lxc 1431710895.498 DEBUG lxc_commands - commands.c:lxc_cmd_get_state:574 - 'migratee' is in 'RUNNING' state


Any clues there?


All the best,

Giles
Tycho Andersen
2015-05-15 17:59:53 UTC
Post by Giles Thomas
Hi Tycho,
Post by Tycho Andersen
Sorry for the delay.
Likewise!
Post by Tycho Andersen
Post by Giles Thomas
liblxc doesn't complain when exec'ing criu fails. Do you have criu installed
in a place where liblxc can find it? I posted a patch to fix this
particular case, but it seems likely that's where your problem is.
It's installed in /usr/sbin/criu -- the lxc monitor is running as root, so
that should be OK, right?
I think so, but obviously something is wrong :). If you cat
/proc/`pidof lxd`/environ, is /usr/sbin in its path?
Yes, it is.
Post by Tycho Andersen
It may be worth
upgrading to the lxd/lxd-client from git master; I wrote a patch a few
days ago so you can do:
lxc info migratee --show-log
and get the lxc log output, which should have the error you're
experiencing.
I built lxd from source and re-ran the test; here's what I got from the
--show-log:
[... output as in the previous message ...]
Any clues there?
Unfortunately not. I suspect liblxc still isn't finding the criu
binary. Can you symlink it into /bin just to be sure?

Tycho
Giles Thomas
2015-05-27 17:26:56 UTC
Permalink
Hi Tycho,

Sorry again for the slow turnaround!
Post by Tycho Andersen
I suspect it still can't find criu and it just isn't finding the
binary. Can you symlink it into /bin just to be sure?
That didn't help. However, installing criu from their GitHub repo
seems to have moved things on a bit. Now, instead of getting "error:
checkpoint failed", I get "error: restore failed". Inside
/var/log/lxd/migratee, there is a file called
"migration_dump_2015-05-27T17:18:45Z.log", about 536K long. I've not
attached it, as I figure that would be pretty annoying for everyone else
on the list, but I can send it directly to you if it would be useful.
There is also a 78K lxc.log.

On the destination machine, there's also a
/var/log/lxd/migratee/lxc.log, which is significantly shorter; here are
the contents:

lxc 1432744458.034 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type u nsid 0 hostid 100000 range 65536
lxc 1432744458.034 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type g nsid 0 hostid 100000 range 65536
lxc 1432744544.115 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type u nsid 0 hostid 100000 range 65536
lxc 1432744544.115 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type g nsid 0 hostid 100000 range 65536
lxc 1432744765.397 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type u nsid 0 hostid 100000 range 65536
lxc 1432744765.398 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type g nsid 0 hostid 100000 range 65536
lxc 1432747103.562 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type u nsid 0 hostid 100000 range 65536
lxc 1432747103.563 INFO lxc_confile - confile.c:config_idmap:1390 - read uid map: type g nsid 0 hostid 100000 range 65536
lxc 1432747128.877 ERROR lxc_criu - criu.c:criu_ok:333 - couldn't find devices.deny = c 5:1 rwm
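
For anyone sifting a longer lxc.log, here is a rough Python sketch that pulls out just the ERROR entries and converts the epoch timestamps to UTC. The entry layout ("lxc", timestamp, level, module, " - ", message) is inferred only from the lines quoted in this thread, and the function names are made up for illustration; entries wrapped over several lines by a mail client are glued back together:

```python
import re
from datetime import datetime, timezone

# Entry header as it appears in this thread (layout inferred, not official):
#   lxc <epoch seconds> <LEVEL> <module> - <message>
ENTRY = re.compile(r"^lxc (\d+\.\d+) ([A-Z]+) (\S+) - ?(.*)$")


def parse_lxc_log(text):
    """Return a list of (utc_datetime, level, module, message) tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = ENTRY.match(line)
        if m:
            when = datetime.fromtimestamp(float(m.group(1)), tz=timezone.utc)
            entries.append([when, m.group(2), m.group(3), m.group(4)])
        elif entries:
            # Continuation of a wrapped entry: append to the previous message.
            entries[-1][3] = (entries[-1][3] + " " + line).strip()
    return [tuple(e) for e in entries]


def errors_only(entries):
    """Keep only entries logged at ERROR level."""
    return [e for e in entries if e[1] == "ERROR"]


SAMPLE = """\
lxc 1432747103.563 INFO lxc_confile -
confile.c:config_idmap:1390 - read uid map: type g nsid 0 hostid 100000
range 65536
lxc 1432747128.877 ERROR lxc_criu - criu.c:criu_ok:333 -
couldn't find devices.deny = c 5:1 rwm
"""

for when, level, module, message in errors_only(parse_lxc_log(SAMPLE)):
    print(when.strftime("%Y-%m-%dT%H:%M:%SZ"), module, message)
```

On the sample this reports the lxc_criu error at 2015-05-27T17:18:48Z, three seconds after the timestamp in the migration dump log's filename.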


All the best,

Giles
Tycho Andersen
2015-05-27 17:33:46 UTC
Permalink
Post by Giles Thomas
lxc 1432747128.877 ERROR lxc_criu - criu.c:criu_ok:333 -
couldn't find devices.deny = c 5:1 rwm
Ah, this is a sanity check to make sure that various container config
properties are set. It looks like things aren't set on the destination
host correctly; I think there was a bug with this in the 0.9 client,
fixed by 6b5595d03dff7d360f05fa48ee6198d71e7f1ef4, so you may want to
upgrade to 0.10.
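
To illustrate the kind of check involved, here is a minimal Python sketch that looks for required settings in a container config. It is not the actual criu_ok code: the only requirement taken from the log is "devices.deny = c 5:1 rwm", and the suffix matching, the function names, and the sample keys in the usage note are assumptions:

```python
def parse_config(text):
    """Collect (key, value) pairs from 'key = value' lines; keys may repeat."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs


# Only this entry comes from the error message in the thread.
REQUIRED = [("devices.deny", "c 5:1 rwm")]


def missing_settings(pairs, required=REQUIRED):
    """Return the required (key_suffix, value) entries absent from the config.

    Keys are matched by suffix because the log prints a shortened key name.
    """
    missing = []
    for suffix, value in required:
        if not any(k.endswith(suffix) and v == value for k, v in pairs):
            missing.append((suffix, value))
    return missing
```

For example, a config containing "lxc.cgroup.devices.deny = c 5:1 rwm" would yield an empty list, while one without that line would yield [("devices.deny", "c 5:1 rwm")], which corresponds to the "couldn't find" message above.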

Tycho
Giles Thomas
2015-05-27 17:37:02 UTC
Permalink
Post by Tycho Andersen
Post by Giles Thomas
lxc 1432747128.877 ERROR lxc_criu - criu.c:criu_ok:333 -
couldn't find devices.deny = c 5:1 rwm
Ah, this is a sanity check to make sure that various container config
properties are set. It looks like things aren't set on the destination
host correctly; I think there was a bug with this in the 0.9 client,
fixed by 6b5595d03dff7d360f05fa48ee6198d71e7f1ef4, so you may want to
upgrade to 0.10.
Do I have to build from source for that? I had to rebuild the machines
to run the test, and used the ppa:ubuntu-lxc/lxd-git-master, but that's
0.9 and it looks like it's not been updated for a few weeks.


All the best,

Giles
Tycho Andersen
2015-05-27 17:41:24 UTC
Permalink
Post by Giles Thomas
Do I have to build from source for that? I had to rebuild the machines to
run the test, and used the ppa:ubuntu-lxc/lxd-git-master, but that's 0.9 and
it looks like it's not been updated for a few weeks.
Yes, unfortunately we still don't have automated builds for the PPAs. I
think stgraber will be uploading 0.10 later today when the official
release announcement goes out, so you can just wait until then if you
want.

Tycho
Giles Thomas
2015-05-27 17:43:25 UTC
Permalink
Post by Tycho Andersen
Yes, unfortunately we still don't have automated builds for the PPAs. I
think stgraber will be uploading 0.10 later today when the official
release announcement goes out, so you can just wait until then if you
want.
OK, will do.


Giles