Difference between revisions of "Manual Maintenance"
Martin Cupak (talk | contribs) m (→Connecting to your camera) |
Martin Cupak (talk | contribs) (added Replacing embedded PC board - Commell LE-37D (DFNSMALL, DFNKIT)) (Tag: visualeditor-switched) |
Revision as of 03:12, 16 April 2020
Connecting to your camera system
If you are ever required to perform any manual maintenance, first connect to your camera.
If you are unable to connect to the camera system locally via the network (Ethernet or WiFi), or remotely over the VPN, you can log in locally using an HDMI or VGA monitor and a USB keyboard.
Direct Connection
- Before turning on the system, connect a keyboard using USB and a monitor using either HDMI or VGA *insert figure of where to connect*
- Log in as root
Checking removable hard drives
General notes:
- Before deploying cameras, the drives should be empty. Delete the data recorded, e.g. during testing in the lab, from all drives including /data0.
- It is good practice to check how full the drives are before replacing them when servicing the observatory. If it has been running for several months, at least /data1 should be full. If it is not, the observatory was not working and needs to be serviced. You can also decide to replace only the drives that actually have data on them and leave the empty ones in the box.
Now let's start - first switch the drives on and mount them:
$ python /opt/dfn-software/enable_ext-hd.py
In case of DFNEXT observatories, wait at least 20 seconds for the drives to spin up, then mount the drives:
$ mount /data1
$ mount /data2
$ mount /data3
In case of DFNSMALL observatories, wait at least 40 seconds for the system to recognize the USB enclosure and spin up the drives, then mount them:
$ mount -a
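Instead of a fixed waiting time, the script can poll for the device. The following is only a sketch: the helper name wait_for_path is hypothetical and not part of the DFN software, and the device node in the usage comment is an assumption that should be checked with lsblk.

```shell
# Poll once per second until a path exists or the timeout (seconds) expires.
# Returns 0 as soon as the path appears, 1 on timeout.
wait_for_path() {
    path=$1
    timeout=${2:-40}
    waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (hypothetical device name):
# wait_for_path /dev/sdb1 40 && mount -a
```

The default of 40 seconds simply mirrors the longer of the two waiting times given above.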
The next step is to list the drives - we are interested only in the data partitions:
$ df -h
In case of a DFNEXT observatory with three 6TB removable drives installed and running for several weeks, you will get a listing like this:
Filesystem      Size  Used Avail Use% Mounted on
...
/dev/sda3       390G   55G  316G  15% /data0
/dev/sdb1       5.5T  1.1T  4.2T  21% /data1 ..... This drive is 21% full
/dev/sdd1       5.5T   58M  5.2T   1% /data2 ..... This drive is empty
/dev/sdc1       5.5T   89M  5.2T   1% /data3 ..... This drive is empty
In case of a DFNSMALL observatory with two 8TB removable drives installed and running for more than half a year, now pretty much full of data, you will see:
Filesystem      Size  Used Avail Use% Mounted on
...
/dev/sda5       406G   59G  327G  16% /data0
/dev/sdc1       7.3T  6.8T   90G  99% /data1 ..... This drive is full
/dev/sdb1       7.3T  6.5T  367G  95% /data2 ..... This drive is nearly full
Note: the /data0 partition is on the system SSD drive. It is available all the time and contains the recent data (last 1-2 nights) and logs.
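The reading of the df listing described above can be sketched as a small filter. The function name summarize_data_drives and the 90% / 2% thresholds are illustrative assumptions, not values used by the DFN software.

```shell
# Read `df -h` output on stdin and print one line per /dataN mount point:
# mount point, use percentage and a rough verdict.
# The 90% and 2% thresholds are illustrative assumptions.
summarize_data_drives() {
    awk '$NF ~ /^\/data[0-9]+$/ {
        pct = $(NF-1); sub(/%/, "", pct)
        verdict = "in use"
        if (pct + 0 >= 90) verdict = "full or nearly full"
        else if (pct + 0 <= 2) verdict = "empty or nearly empty"
        printf "%s %s%% %s\n", $NF, pct, verdict
    }'
}

# Usage:
# df -h | summarize_data_drives
```

The filter only looks at the last two columns, so it works regardless of which /dev/sdX device a drive was assigned.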
When done checking the drives, unmount them, tell the OS to forget about the SATA devices (it's SATA hot-swap) and finally switch them off:
In case of DFNEXT observatory:
$ python /opt/dfn-software/disable_ext-hd.py
Note: this command also internally runs the following commands
$ umount /data1 /data2 /data3
$ echo 1 > /sys/block/sdb/device/delete
$ echo 1 > /sys/block/sdc/device/delete
$ echo 1 > /sys/block/sdd/device/delete
so there is no need to run these individually in case of nominal conditions.
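For reference, the pattern behind these commands can be sketched as a dry-run helper that only prints what would be executed. The helper name print_detach_cmds and its device:mountpoint argument format are hypothetical. Which sdX device belongs to which /dataN varies between boots, so check df -h first.

```shell
# Print (do not run) the unmount and SATA detach commands for the given
# "device:mountpoint" pairs, so they can be reviewed before use.
print_detach_cmds() {
    for pair in "$@"; do
        dev=${pair%%:*}   # part before the colon, e.g. sdb
        mnt=${pair#*:}    # part after the colon, e.g. /data1
        echo "umount $mnt"
        echo "echo 1 > /sys/block/$dev/device/delete"
    done
}

# Usage (device names are examples only):
# print_detach_cmds sdb:/data1 sdd:/data2 sdc:/data3
```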
In case of DFNSMALL observatory:
$ umount /data1 /data2
$ python /opt/dfn-software/disable_ext-hd.py
Installing new HDDs
1. Make sure the hard drives are powered off
Also consider what time it is: a daily task that moves data from the /data0 partition to the removable drives is scheduled in crontab for 10:55 local time.
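A quick way to check whether the current time is close to that task is sketched below. The helper name near_move_task and the default margin of 15 minutes are assumptions. How long the data move actually takes depends on the amount of data, so treat the margin as a placeholder.

```shell
# Return 0 if the given HH:MM time is within `margin` minutes of the
# 10:55 data-move task, 1 otherwise. The margin default is an assumption.
near_move_task() {
    now=$1
    margin=${2:-15}
    echo "$now" | awk -F: -v m="$margin" '{
        diff = ($1 * 60 + $2) - (10 * 60 + 55)
        if (diff < 0) diff = -diff
        exit ((diff <= m) ? 0 : 1)
    }'
}

# Usage:
# near_move_task "$(date +%H:%M)" && echo "Data move may be running, check before powering off drives"
```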
2. Physically replace the drives
Remember to label the drives taken out of the observatory: put a sticker on each drive and note the observatory type, number, site name and replacement date.
3. Format the new drives after replacing
DFNEXT observatory
Power on the enclosure with hard drives and start the formatting script:
$ python /opt/dfn-software/enable_ext-hd.py
Wait 20 seconds, then probe the observatory type and the HDD connection type:
$ cd /root/bin
$ ./dfn_setup_data_hdds.sh -p
This prints:
Probe result: DFNEXT SATA
/dev/sdb data2
/dev/sdc data1
/dev/sdd data3
Suggested command to format all drives:
/root/bin/dfn_setup_data_hdds.sh /dev/sdb data1 /dev/sdc data2 /dev/sdd data3
To format all three drives, execute the suggested command
$ ./dfn_setup_data_hdds.sh /dev/sdb data1 /dev/sdc data2 /dev/sdd data3
Note: The formatting procedure includes a SMART self-test of all the drives.
Tell the OS to forget about the SATA devices (it's SATA hot-swap) and power them off.
$ python /opt/dfn-software/disable_ext-hd.py
Note: this command also internally runs the following commands
$ umount /data1 /data2 /data3
$ echo 1 > /sys/block/sdb/device/delete
$ echo 1 > /sys/block/sdc/device/delete
$ echo 1 > /sys/block/sdd/device/delete
so there is no need to run these individually in case of nominal conditions.
DFNSMALL observatory
Power on the enclosure with hard drives and start the formatting script:
$ python /opt/dfn-software/enable_ext-hd.py
Wait 40 seconds, then run:
$ cd /root/bin
$ ./setup_usb_hdds_jmicron.sh
When prompted for various settings, answer as follows:
- prompt gparted --> N
- prompt "Create partition /dev/sdb1, Format /dev/sdb1 as ext4" --> Y
- prompt "Create partition /dev/sdc1, Format /dev/sdc1 as ext4" --> Y
- wait for the quick SMART self-test to finish and check the result for the 1st drive, particularly the following lines:
...
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
...
# 1 Short offline Completed without error 00% 586 -
...
- press Enter and check the result for the 2nd drive the same way
- At the end, check that the freshly formatted drives are mounted and the expected capacities are listed
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       5.5T   58M  5.2T   1% /data2 ..... this is 6TB drive
/dev/sdc1       3.6T   68M  3.4T   1% /data1 ..... this is 4TB drive
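The manual check of the three SMART counters can be sketched as a filter over attribute lines in the format shown above. The function name check_smart_counters is hypothetical, and feeding it from smartctl -A (the smartmontools attribute listing) is an assumption about how the output is obtained outside the setup script.

```shell
# Read SMART attribute lines on stdin and report any of the attributes
# 196, 197, 198 whose raw value (last column) is not 0.
# Exit status 0 means every one of these counters that was found is zero.
check_smart_counters() {
    awk '$1 == 196 || $1 == 197 || $1 == 198 {
        if ($NF + 0 != 0) {
            printf "WARNING: %s raw value %s\n", $2, $NF
            bad = 1
        }
    }
    END { exit (bad ? 1 : 0) }'
}

# Usage (hypothetical device name, check with lsblk):
# smartctl -A /dev/sdb | check_smart_counters && echo "counters OK"
```

A non-zero pending or reallocated count on a freshly installed drive is a reason to look at the full report before deploying it.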
And finally power off the drives:
$ python /opt/dfn-software/disable_ext-hd.py
Transfer data from data0 to removable HDs manually
$ nohup /usr/local/bin/move_data_files.sh &
This stores information about the move in the nohup.out file in the current working directory. The data-moving task will continue even if you disconnect from the camera.
Check CF card
To check if there are images on the camera CF card:
$ python /opt/dfn-software/enable_camera.py
$ gphoto2 -L -R # this will list all files on camera
To format the CF card, see the dedicated instruction page here.
Capture control test
1. Run the test
$ /opt/dfn-software/int_control_test.sh
This will take ~10 mins. You should start to hear shutter clicks as test photos are taken.
To monitor the interval test as it runs (in another terminal):
$ tail -f /data0/latest/*interval.txt
2. Check that the interval control test successfully took pictures
Check that there are ~10 pictures, taken at the time the test was run, in the previous images folder:
$ cd /data0/latest_prev
$ ls
Check Shutter Count
$ cd /data0/latest_prev
or a folder with the format:
$ cd /data0/DFNXXXNN/YYYY/MM/YYYY-MM-DD_DFNXXXNN_1XXXXXXXXX
$ exiv2 -p a **image**.NEF | grep hutter
or
$ grep hutter *interval.txt
Checking GPS
First check GPS lock and position/time information acquired by the GPS and reported to the PC - run command:
$ cgps
See the upper part of the screen for position and time. In the bottom part of the screen, the NMEA messages from the GPS should be scrolling.
Press [Q] to exit cgps.
Note: No lock means no reception. As long as there is text scrolling in the bottom part of the screen, the GPS is communicating with the observatory PC.
The second thing to check is that the capture control SW gets the location and time when it runs with the GPS antenna connected and with good signal reception. Inspect the *interval.txt logs, produced either by the capture control test or by regular overnight operation. In case of nominal GPS functionality, the ntp NMEA/PPS time correction should be active:
INFO, interval_control_lin, ntp, +SHM(0),.NMEA.,0,l,9,16,377,0.000,-11.851,3.472
INFO, interval_control_lin, ntp, *SHM(1),.PPS.,0,l,9,16,377,0.000,0.022,0.008
and coordinates should be passed from the GPS receiver (the last number '1' means the GPS has lock):
INFO, interval_control_lin, GPS_lonlat, 135.274305, -30.857625, 156.26, 1
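Pulling the lock flag out of the log can be sketched as follows. The function name last_gps_lock_flag is hypothetical, and the sketch assumes the GPS_lonlat lines always have the comma-separated layout shown above, with the lock flag as the last field.

```shell
# Read interval log lines on stdin and print the lock flag (last field)
# of the most recent GPS_lonlat entry; prints nothing if none is found.
last_gps_lock_flag() {
    awk -F', *' '/GPS_lonlat/ { flag = $NF } END { if (flag != "") print flag }'
}

# Usage (log location as used for monitoring the capture control test):
# cat /data0/latest/*interval.txt | last_gps_lock_flag
```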
There is also a Python script to query the leostick/arduino microcontroller for the GPS status:
$ python /opt/dfn-software/leostick_get_status.py -g
GPGGA,081358.000,3140.0427,S,11639.9456,E,1,16,0.6,195.02,M,-23.9,M,,
The above GPS sentence is an example of a receiver having lock (the "1" right after "E"), coordinates 31 deg 40.0427 min S, 116 deg 39.9456 min E, elevation 195.02 m, while the sentence below shows the situation without lock (no reception, "0" in the same position).
$ python /opt/dfn-software/leostick_get_status.py -g
GPGGA,045843.000,3102.8703,S,11550.3033,E,0,00,99.0,202.58,M,-27.9,M,,
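Reading such a sentence by hand is error-prone, so here is a sketch that extracts the fix flag and satellite count and converts the degrees-and-minutes position to signed decimal degrees. The function name parse_gpgga is hypothetical. The field positions follow the standard NMEA GGA layout and the sentence is expected without a leading "$", as printed above.

```shell
# Parse one GPGGA sentence: print fix quality, satellite count and the
# position in signed decimal degrees (south and west are negative).
# Input format: ddmm.mmmm for latitude, dddmm.mmmm for longitude.
parse_gpgga() {
    echo "$1" | awk -F, '{
        lat = int($3 / 100) + ($3 - int($3 / 100) * 100) / 60
        lon = int($5 / 100) + ($5 - int($5 / 100) * 100) / 60
        if ($4 == "S") lat = -lat
        if ($6 == "W") lon = -lon
        printf "fix=%d sats=%d lat=%.6f lon=%.6f\n", $7, $8, lat, lon
    }'
}

# Usage with the first example sentence:
# parse_gpgga 'GPGGA,081358.000,3140.0427,S,11639.9456,E,1,16,0.6,195.02,M,-23.9,M,,'
```

A result with fix=0 and sats=0 corresponds to the no-reception case; the position fields in such a sentence are not a current fix and should not be trusted.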
Software Updates
Automated software updates
As long as the observatory is connected to the Internet, the software that controls the observatory updates automatically every day from a dedicated DFN server. There are two update attempts in the afternoon local time, ~40 minutes before the daily reboot and ~20 minutes after it; the default times are 3:30 PM and 4:30 PM for the SW update and 4:10 PM for the reboot.
Manual software update over Internet
It might be handy to run the network SW update manually, for example when testing, or when deploying a new observatory that was off-line for some time, in transport or stored as a spare. In this case, log in to the observatory and execute the command
$ dfn_down_install_sw_from_server.sh
Manual software update using local copy
In case of a remote site without an Internet connection, the only way to update the observatory software is to bring a copy of the software, e.g. on a laptop, and do the update locally.
The Australian DFN team members can find the latest stable software in the internal DFN repo in operation/SW/dfnsmall/stable (DFNSMALL type of observatory) or operation/SW/dfnext/stable (DFNEXT type). External collaborators will be provided with a copy of the software on request.
Assuming the servicing person has a Linux environment on their laptop, the first step is a dry run to check that there are no typos in the command:
$ rsync -nrv opt usr root@172.16.1.101:/
If there are no errors or anomalies, just a list of files that would be copied, you can run the real update:
$ rsync -rv opt usr root@172.16.1.101:/
Note: The above IP address is for a local wired connection (over Ethernet cable); for a WiFi connection use IP 172.16.0.101
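To avoid mixing up the two addresses or forgetting the dry run, the command can be assembled by a small helper. This is a sketch only: the function name dfn_rsync_cmd is hypothetical, and it prints the command for review rather than running it.

```shell
# Print the rsync command for a local software update.
# $1: "wired" or "wifi" (IP addresses as given in the note above)
# $2: "dry" for a dry run (-n), anything else for the real update
dfn_rsync_cmd() {
    case "$1" in
        wired) ip=172.16.1.101 ;;
        wifi)  ip=172.16.0.101 ;;
        *) echo "usage: dfn_rsync_cmd wired|wifi [dry]" >&2; return 2 ;;
    esac
    if [ "${2:-}" = "dry" ]; then opts=-nrv; else opts=-rv; fi
    echo "rsync $opts opt usr root@$ip:/"
}

# Usage, from the directory that contains the opt and usr folders:
# dfn_rsync_cmd wifi dry
```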
Replacing embedded PC board
Commell LE-37D (DFNSMALL, DFNKIT)
When replacing the PC board, the system drive (mSATA SSD card) and, optionally, the mobile network 3G/4G modem card are moved from the old PC to the new one.
Ethernet networking will not work out of the box, because the udev subsystem automatically pairs the network interface names (eth0, eth1) with the unique MAC addresses of each board and remembers that setting. With the system drive in the new PC board, udev recognizes the new network interfaces but names them eth2 and eth3, which does not match the network configuration.
To fix that, the stored configuration needs to be cleared: boot the PC with a screen (we use cheap upcycled VGA panels in our lab most of the time, but HDMI should work as well) and a keyboard connected, run the script /root/bin/rm_udev.sh and then reboot.
Note: if the system is booted first and a monitor is connected later, the monitor most likely will not work. A keyboard should be no problem, including connecting it later for a blind CTRL+ALT+DEL reboot.
Note: the persistent udev network interface configuration does not affect the mobile network 3G/4G modem card interface; however, the user needs to make sure the operator and modem type settings are correct for the current location.
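Before clearing the configuration, it can help to see which MAC addresses the names are currently bound to. The sketch below assumes the pairing is stored in a Debian-style /etc/udev/rules.d/70-persistent-net.rules file; that path, the rule layout and the function name list_udev_net_bindings are assumptions, and what /root/bin/rm_udev.sh actually does is not documented here.

```shell
# List "interface MAC" pairs stored in a persistent-net udev rules file.
# The default path is an assumption (Debian-style systems).
list_udev_net_bindings() {
    rules=${1:-/etc/udev/rules.d/70-persistent-net.rules}
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p' "$rules"
}

# Usage:
# list_udev_net_bindings
```

If the list shows eth0 and eth1 bound to the old board and eth2 and eth3 bound to the new one, that matches the situation described above.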
Useful Commands
$ df -h # checks if hard drives are mounted - list of mounted disk devices with disk usage/free information
$ lsblk # lists all hard disk devices in the system and where (if) they are mounted
$ cgps # gives GPS coordinates in a table if there is sat lock and monitors communication GPS->PC. Press [Q] to exit.
$ ntpq -p # checks NTP time correction status
$ watch df # monitors df changes, good for checking data transfers
$ du -hs * | grep G # shows folders with sizes in the GB range
$ crontab -l # shows the scheduled tasks, good for finding commands you want to run manually now