Testbox Imaging (Backup / Restore)
==================================


Introduction
------------

This document explores deploying a very simple drive imaging solution to help
avoid needing to manually reinstall testboxes when a disk goes bust or the OS
install seems to be corrupted.


Definitions / Glossary
======================

See AutomaticTestingRevamp.txt.


Objectives
==========

- Off site, no admin interaction (no need for ILOM or similar).
- OS independent.
- Space and bandwidth efficient.
- As automatic as possible.
- Logging.


Overview of the Solution
========================

Here is a brief summary:

- Always boot testboxes via PXE using PXELINUX.
- The default configuration is local boot (hard disk / SSD).
- A restore/backup action is triggered by a machine specific PXE config.
- The testbox then boots a special Debian maintenance install off NFS.
- A maintenance service (systemd style) does the work.
- The service reads the action from the TFTP location and performs it.
- When done, the service removes the machine specific TFTP config
  and reboots the system.

Maintenance actions are:

- backup
- backup-again
- restore
- refresh-info
- rescue

A possible modifier could indicate a subset of the disks on testboxes with
other OSes installed. Support for partition level backup/restore is not
explored here.


How to use
----------

To perform one of the above maintenance actions on a testbox, copy the
``do-<action>.cfg`` file over to the testbox hex-IP config file name. Then
trigger a reboot. The box will then boot the NFS rooted Debian image and
execute the maintenance action. On success, it will remove the testbox hex-IP
config file and reboot again.

The hex-IP config file name for a testbox is generated by taking the IPv4
address, converting each of the 4 numbers to two-digit upper case hex and
removing the dots. Given the IP ``10.42.1.96`` you can use the unix ``printf``
command like this ``printf "%02X%02X%02X%02X" 10 42 1 96`` to get the name
(``0A2A0160``).

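The conversion can be wrapped in a small shell helper. This is only a sketch:
the function name ``ip_to_pxe_hex`` is made up for illustration and is not
part of ``testbox-maintenance.sh``::

    #!/bin/sh
    # Hypothetical helper: convert a dotted IPv4 address into the upper case
    # hex form used for the per-testbox config file name.
    ip_to_pxe_hex()
    {
        # Split the address on the dots into the positional parameters.
        old_ifs="$IFS"
        IFS=.
        set -- $1
        IFS="$old_ifs"
        # Print each of the four numbers as two upper case hex digits.
        printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
    }

    ip_to_pxe_hex 10.42.1.96

Run as is, the last line prints the name for the example address used above.
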

Storage Server
==============

The storage server will have three areas used here. Using NFS for all three
avoids extra work getting CIFS sharing right too (NFS is already a pain).

1. ``/export/testbox-tftp`` - TFTP config area. Read-write.
2. ``/export/testbox-backup`` - Images and logs. Read-write.
3. ``/export/testbox-nfsroot`` - Custom Debian. Read-only, no root squash.


TFTP (/export/testbox-tftp)
============================

The testbox-tftp share needs to be writable; root squashing is okay.

We need files from both PXELINUX and SYSLINUX to make this work now. On a
Debian system, the ``pxelinux`` and ``syslinux`` packages need to be
installed. We actually do this further down when setting up the nfsroot, so
it's possible to get them from there by postponing this step a little. On
Debian 8.6.0 the PXELINUX files are found in ``/usr/lib/PXELINUX`` and the
SYSLINUX ones in ``/usr/lib/syslinux``.

The initial PXE image as well as the associated modules come in three
variants: BIOS, 32-bit EFI and 64-bit EFI. We'll only need the BIOS one for
now. Perform the following copy operations::

    cp /usr/lib/PXELINUX/pxelinux.0 /mnt/testbox-tftp/
    cp /usr/lib/syslinux/modules/*/ldlinux.* /mnt/testbox-tftp/
    cp -R /usr/lib/syslinux/modules/bios /mnt/testbox-tftp/
    cp -R /usr/lib/syslinux/modules/efi32 /mnt/testbox-tftp/
    cp -R /usr/lib/syslinux/modules/efi64 /mnt/testbox-tftp/


For simplicity, all the testboxes boot using good old fashioned BIOS, no EFI.
However, it doesn't really hurt to be prepared.

The PXELINUX related files go in the root of the testbox-tftp share. (As
mentioned further down, these can be installed on a Debian system by running
``apt-get install pxelinux syslinux``.) We need the ``*pxelinux.0`` files
typically found in ``/usr/lib/PXELINUX/`` on Debian systems (recent ones
anyway). It is possible we may need one or more of the modules [1]_ that
ship with PXELINUX/SYSLINUX, so do copy ``/usr/lib/syslinux/modules`` to
``testbox-tftp/modules`` as well.


The directory layout related to the configuration files is dictated by the
PXELINUX configuration file searching algorithm [2]_. Create a subdirectory
``pxelinux.cfg/`` under ``testbox-tftp`` and create the world readable file
``default`` with the following content::

    PATH bios
    DEFAULT local-boot
    LABEL local-boot
    LOCALBOOT

This makes booting from the local disk the default behavior.

Create ``pxelinux.cfg/do-backup``, ``pxelinux.cfg/do-backup-again``,
``pxelinux.cfg/do-restore``, ``pxelinux.cfg/do-refresh-info``, and
``pxelinux.cfg/do-rescue`` configuration files of the form::

    PATH bios
    DEFAULT maintenance
    LABEL maintenance
    MENU LABEL Maintenance (NFS)
    KERNEL maintenance-boot/vmlinuz-3.16.0-4-amd64
    APPEND initrd=maintenance-boot/initrd.img-3.16.0-4-amd64 testbox-action-backup ro ip=dhcp aufs=tmpfs boot=nfs root=/dev/nfs nfsroot=10.42.1.1:/export/testbox-nfsroot,ro nfsvers=3 nfsrootdebug
    LABEL local-boot
    LOCALBOOT

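Since the five files presumably differ only in the ``testbox-action-<action>``
token on the ``APPEND`` line, they could be generated from one template. The
sketch below makes that assumption explicit. Both function names are
hypothetical, and ``action_from_cmdline`` only illustrates how a service might
recover the action from the kernel command line, not how
``testbox-maintenance.sh`` actually does it::

    #!/bin/sh
    # Sketch only. Assumes the do-<action> files differ solely in the
    # testbox-action-<action> token; the function names are hypothetical.

    # Write the do-<action> file. $1 = pxelinux.cfg directory, $2 = action.
    write_action_config()
    {
        printf '%s\n' \
            'PATH bios' \
            'DEFAULT maintenance' \
            'LABEL maintenance' \
            'MENU LABEL Maintenance (NFS)' \
            'KERNEL maintenance-boot/vmlinuz-3.16.0-4-amd64' \
            "APPEND initrd=maintenance-boot/initrd.img-3.16.0-4-amd64 testbox-action-$2 ro ip=dhcp aufs=tmpfs boot=nfs root=/dev/nfs nfsroot=10.42.1.1:/export/testbox-nfsroot,ro nfsvers=3 nfsrootdebug" \
            'LABEL local-boot' \
            'LOCALBOOT' > "$1/do-$2"
    }

    # Print the action named by a testbox-action-* word in the given kernel
    # command line string. Returns 1 if there is no such word.
    action_from_cmdline()
    {
        for word in $1; do
            case "$word" in
                testbox-action-*)
                    echo "${word#testbox-action-}"
                    return 0
                    ;;
            esac
        done
        return 1
    }

A loop such as ``for a in backup backup-again restore refresh-info rescue; do
write_action_config /mnt/testbox-tftp/pxelinux.cfg "$a"; done`` would then
create all five files, and on a booted maintenance system
``action_from_cmdline "$(cat /proc/cmdline)"`` would print the requested action.
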
When you want to perform an action on a testbox, copy the ``do-<action>`` file
to ``pxelinux.cfg/<HEX-ip-addr>`` and trigger a boot of the testbox. The
machine config will be removed automatically once the action has been
successfully completed.

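The copy step can be scripted. The following is a minimal sketch, assuming the
TFTP share is mounted and the ``do-<action>`` files exist as described above.
The function name ``request_action`` is hypothetical::

    #!/bin/sh
    # Sketch only: request_action is a hypothetical helper name.
    # $1 = mounted testbox-tftp directory, $2 = action, $3 = testbox IPv4.
    request_action()
    {
        cfg_dir="$1/pxelinux.cfg"
        # PXELINUX looks for the upper case hex form of the client address.
        hex_ip=$(printf '%02X%02X%02X%02X' $(echo "$3" | tr '.' ' '))
        if [ ! -f "$cfg_dir/do-$2" ]; then
            echo "unknown action: $2" >&2
            return 1
        fi
        # Install the machine specific config and print its path.
        cp "$cfg_dir/do-$2" "$cfg_dir/$hex_ip" && echo "$cfg_dir/$hex_ip"
    }

For example, ``request_action /mnt/testbox-tftp backup 10.42.1.96`` would
create ``pxelinux.cfg/0A2A0160``. The testbox still has to be rebooted
separately.
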


Images and logs (/export/testbox-backup)
=========================================

The testbox-backup share needs to be writable; root squashing is okay.

In the root there must be a file ``testbox-backup`` so we can easily tell
whether we've actually mounted the share or are just staring at an empty mount
point directory.

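A check for the marker file could look like the sketch below. The function
name is hypothetical, and the default path is simply the mount point used
elsewhere in this document::

    #!/bin/sh
    # Sketch only: is_backup_share_mounted is a hypothetical helper name.
    # $1 = mount point; defaults to the location used in this document.
    is_backup_share_mounted()
    {
        # The marker file only exists on the real share, never in the bare
        # mount point directory.
        [ -f "${1:-/mnt/testbox-backup}/testbox-backup" ]
    }

A script would typically bail out early with something like
``is_backup_share_mounted || exit 1`` before writing any images.
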
The ``testbox-maintenance.sh`` script maintains a global log in the root
directory that's called ``maintenance.log``. Errors will be logged there as
well as a ping and the action.

We use a directory layout based on dotted decimal IP addresses here, so for a
server with the IP 10.40.41.42 all its files will be under ``10.40.41.42/``:

``<hostname>``
    The name of the testbox (empty file). Helps with finding a testbox by name.

``testbox-info.txt``
    Information about the testbox. Starting off with the name, decimal IP,
    PXELINUX style hexadecimal IP, and more.

``maintenance.log``
    Maintenance log file recording what the maintenance service does.

``disk-devices.lst``
    Optional list of disk devices to consider backing up or restoring. This is
    intended for testboxes with additional disks that are used for other
    purposes and should not be touched.

``sda.raw.gz``
    The gzipped raw copy of the sda device of the testbox.

``sd[bcdefgh].raw.gz``
    The gzipped raw copies of sdb, sdc, sdd, sde, sdf, sdg, sdh, etc. if any of
    them exist and are disks/SSDs.

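To illustrate what a "gzipped raw copy" amounts to, here is a minimal sketch
using plain ``dd`` and ``gzip``. It is not the logic of
``testbox-maintenance.sh``: the function names are hypothetical, and the real
script may well use other tools or options::

    #!/bin/sh
    # Sketch only, not the logic of testbox-maintenance.sh: image_disk and
    # restore_disk are hypothetical names, and dd + gzip is an assumption.

    # Write <backup dir>/<device name>.raw.gz from a raw device.
    # $1 = source device (e.g. /dev/sda), $2 = per-testbox backup directory.
    image_disk()
    {
        dd if="$1" bs=1048576 2>/dev/null \
            | gzip -c > "$2/$(basename "$1").raw.gz"
    }

    # Write the stored image back onto the device.
    # $1 = target device, $2 = per-testbox backup directory.
    restore_disk()
    {
        gzip -dc "$2/$(basename "$1").raw.gz" \
            | dd of="$1" bs=1048576 2>/dev/null
    }

With the layout above, ``image_disk /dev/sda /mnt/testbox-backup/10.40.41.42``
would produce ``10.40.41.42/sda.raw.gz``. Note that ``restore_disk`` overwrites
the target device without asking.
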

Note! If it turns out we can be certain to get a valid host name, we might just
switch to using the hostname as the directory name instead of the IP.


Debian NFS root (/export/testbox-nfsroot)
==========================================

The testbox-nfsroot share should be read-only and must **not** have root
squashing enabled.

There are several ways of creating a Debian nfsroot, but since we've got a
tool like VirtualBox around we've just installed it in a VM, prepared it,
and copied it onto the NFS server share.

As of writing Debian 8.6.0 is current, so a minimal 64-bit install of it was
done in a VM. After installation the following modifications were made:

- ``apt-get install pxelinux syslinux initramfs-tools zip gddrescue joe``
  and optionally ``apt-get install smbclient cifs-utils``.

- ``/etc/default/grub`` was modified to set ``GRUB_CMDLINE_LINUX_DEFAULT`` to
  ``""`` instead of ``"quiet"``. This allows us to see messages during boot
  and perhaps spot why something doesn't work on a testbox. Regenerate the
  grub configuration file by running ``update-grub`` afterwards.

- Create the directory ``/etc/systemd/system/getty@tty1.service.d`` and create
  the file ``noclear.conf`` in it with the following content::

      [Service]
      TTYVTDisallocate=no

  This stops getty from clearing VT1 and lets us see the tail of the boot up
  messages, which includes messages from the testbox-maintenance service.

- Mount the testbox-nfsroot under ``/mnt/`` with write privileges (the write
  privileges are temporary, so don't forget to remove them later on)::

      mount -t nfs myserver.com:/export/testbox-nfsroot /mnt/

  Note! Adding ``-o nfsvers=3`` may help with some NFSv4 servers.

- Copy the Debian root and dev file system onto nfsroot. If you have ssh
  access to the NFS server, the quickest way to do it is to use ``tar``::

      tar -cz --one-file-system -f /mnt/testbox-maintenance-nfsroot.tar.gz . dev/

  An alternative is ``cp -ax . /mnt/. && cp -ax dev/. /mnt/dev/.`` but this
  is quite a bit slower, obviously.

- chroot into the nfsroot: ``chroot /mnt/``

- ``mount -t proc proc /proc``

- ``mount -t sysfs sysfs /sys``

- ``mkdir /mnt/testbox-tftp /mnt/testbox-backup``

- Recreate ``/etc/fstab`` with::

      proc /proc proc defaults 0 0
      /dev/nfs / nfs defaults 1 1
      10.42.1.1:/export/testbox-tftp /mnt/testbox-tftp nfs nfsvers=3 2 2
      10.42.1.1:/export/testbox-backup /mnt/testbox-backup nfs nfsvers=3 3 3

- Do ``mount /mnt/testbox-tftp && mount /mnt/testbox-backup`` to mount the
  two shares. This may be a good time to execute the instructions in the
  sections above relating to these two shares.

- Edit ``/etc/initramfs-tools/initramfs.conf`` and change the ``MODULES``
  value from ``most`` to ``netboot``.

- Append ``aufs`` to ``/etc/initramfs-tools/modules``. The advanced
  multi-layered unification filesystem (aufs) enables us to use a
  read-only NFS root. [3]_ [4]_ [5]_

- Create ``/etc/initramfs-tools/scripts/init-bottom/00_aufs_init`` as
  an executable file with the following content::

      #!/bin/sh
      # Don't run during update-initramfs:
      case "$1" in
          prereqs)
              exit 0;
              ;;
      esac

      modprobe aufs
      mkdir -p /ro /rw /aufs
      # A tmpfs serves as the writable layer.
      mount -t tmpfs tmpfs /rw -o noatime,mode=0755
      # Move the read-only NFS root out of the way.
      mount --move $rootmnt /ro
      # Stack the writable tmpfs on top of the read-only NFS root.
      mount -t aufs aufs /aufs -o noatime,dirs=/rw:/ro=ro
      # Keep both branches reachable from inside the new root.
      mkdir -p /aufs/rw /aufs/ro
      mount --move /ro /aufs/ro
      mount --move /rw /aufs/rw
      # The union becomes the root file system.
      mount --move /aufs /root
      exit 0

- Update the init ramdisk: ``update-initramfs -u -k all``

  Note! It may be necessary to do ``mount -t tmpfs tmpfs /var/tmp`` to help
  this operation succeed.

- Copy ``/boot`` to ``/mnt/testbox-tftp/maintenance-boot/``.

- Copy the ``testbox-maintenance.sh`` file found in the same directory as this
  document to ``/root/scripts/`` (need to create the dir) and make it
  executable.

- Create the systemd service file for the maintenance service as
  ``/etc/systemd/system/testbox-maintenance.service`` with the content::

      [Unit]
      Description=Testbox Maintenance
      After=network.target
      Before=getty@tty1.service

      [Service]
      Type=oneshot
      RemainAfterExit=True
      ExecStart=/root/scripts/testbox-maintenance.sh
      ExecStartPre=/bin/echo -e \033%G
      ExecReload=/bin/kill -HUP $MAINPID
      WorkingDirectory=/tmp
      Environment=TERM=xterm
      StandardOutput=journal+console

      [Install]
      WantedBy=multi-user.target

- Enable our service: ``systemctl enable /etc/systemd/system/testbox-maintenance.service``

- xxxx ... more ???

- Before leaving the chroot, do ``umount /proc /sys /mnt/testbox-*``.

- Testing the setup from a VM is kind of useful (if the NFS server can be
  convinced to accept root NFS mounts from non-privileged client ports):

  - Create a VM using the 64-bit Debian profile. Let's call it "pxe-vm".
  - Mount the TFTP share somewhere, like M: or /mnt/testbox-tftp.
  - Reconfigure the NAT DHCP and TFTP bits::

        VBoxManage setextradata pxe-vm VBoxInternal/PDM/DriverTransformations/pxe/AboveDriver NAT
        VBoxManage setextradata pxe-vm VBoxInternal/PDM/DriverTransformations/pxe/Action mergeconfig
        VBoxManage setextradata pxe-vm VBoxInternal/PDM/DriverTransformations/pxe/Config/TFTPPrefix M:/
        VBoxManage setextradata pxe-vm VBoxInternal/PDM/DriverTransformations/pxe/Config/BootFile pxelinux.0

  - Create the file ``testbox-tftp/pxelinux.cfg/0A00020F`` containing::

        PATH bios
        DEFAULT maintenance
        LABEL maintenance
        MENU LABEL Maintenance (NFS)
        KERNEL maintenance-boot/vmlinuz-3.16.0-4-amd64
        APPEND initrd=maintenance-boot/initrd.img-3.16.0-4-amd64 ro ip=dhcp aufs=tmpfs \
               boot=nfs root=/dev/nfs nfsroot=10.42.1.1:/export/testbox-nfsroot
        LABEL local-boot
        LOCALBOOT

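The file name ``0A00020F`` used for the test VM decodes to ``10.0.2.15``,
which, as far as we know, is the address VirtualBox NAT hands to the guest by
default. It is worth checking this against the address the VM actually gets.
The sketch below decodes such a name. The function name ``pxe_hex_to_ip`` is
hypothetical::

    #!/bin/sh
    # Sketch only: pxe_hex_to_ip is a hypothetical helper name.
    # Convert an 8 digit PXELINUX hex name back to a dotted IPv4 address.
    pxe_hex_to_ip()
    {
        # Cut the name into four pairs of hex digits and print each pair
        # as a decimal number.
        printf '%d.%d.%d.%d\n' \
            "0x$(echo "$1" | cut -c1-2)" \
            "0x$(echo "$1" | cut -c3-4)" \
            "0x$(echo "$1" | cut -c5-6)" \
            "0x$(echo "$1" | cut -c7-8)"
    }

    pxe_hex_to_ip 0A00020F

This is also handy for figuring out which testbox a leftover machine config in
``pxelinux.cfg/`` belongs to.
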

-----

.. [1] See http://www.syslinux.org/wiki/index.php?title=Category:Modules
.. [2] See http://www.syslinux.org/wiki/index.php?title=PXELINUX#Configuration
.. [3] See https://en.wikipedia.org/wiki/Aufs
.. [4] See http://shitwefoundout.com/wiki/Diskless_ubuntu
.. [5] See http://debianaddict.com/2012/06/19/diskless-debian-linux-booting-via-dhcppxenfstftp/


-----

:Status: $Id: TestBoxImaging.txt 64524 2016-11-02 20:13:35Z vboxsync $
:Copyright: Copyright (C) 2010-2016 Oracle Corporation.