


Re: [tlug] [Suspected Spam] RAID not seen by system



On Wed, Nov 20, 2019 at 4:24 PM Stephen J. Turnbull <turnbull.stephen.fw@example.com> wrote:
CL writes:

 > I ran into problems with my RAID

Which level (I have some guesses, but...)?  Hardware, software
(guessing the latter since you mention mdadm)?

Software. RAID 5. 3 TB disks x 6->7, Lubuntu, mdadm.
 
You don't use LVM?  (That may be obsolete, I use it because I've
always used it.  DFWAB.)

I don't use LVM; it was too problematic back when I was at my most newbie. Icinga is installed and running as a watch/monitor (which is probably an apples-vs-oranges response).
 
 > a while back in which one member disk was acting wonky.  I FAILed
 > the wonky disk, installed a brand new one (same maker and model
 > number) and performed a full resynch, which took about 30 days.

30 days doesn't sound reasonable for devices I'd consider plausible.
"33000kbps" for the reshaping implies a little over 8 hours/TB, and in
fact for RAID 5 I would expect only the new drive to be written (all
have to be read, though), so a fraction of the time for full
reshaping.

And yet, it wasn't considered all that unusual on the other forums I checked at the time.  Nor are reshape speeds of 40~60 kbps considered all that odd there, and slowing to a stop during the process apparently isn't unusual either.
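For reference, a rough sanity check of that figure (just a sketch with bc; it assumes a sustained 33,000 KB/s and one 3 TB member, roughly 3,000,000,000 KB, being written):

$ echo '3000000000 / 33000 / 3600' | bc -l    # 3 TB in KB, at 33,000 KB/s, expressed in hours
# prints roughly 25.25: about a day for one 3 TB member, i.e. a little over
# 8 hours/TB, nowhere near 30 days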

 > 1.  The RAID is "there", tests healthy

What tests?  May as well give output for the tests, please.

Just the basics for now. 

$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0

$ sudo mdadm --assemble --run --force --update=resync /dev/md0
mdadm: /dev/md0 has been started with 7 drives.

*NOTE: not my usual start. I used "--update=resync" because the verbose output I was receiving while the RAID was read-only kept referencing resyncing.  I normally run "sudo mdadm --assemble /dev/md0" only and do not enumerate the member disks.

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdd[0] sdl[6] sdk[7] sdj[4] sdi[3] sdf[2] sde[1]
      14650670080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [7/7] [UUUUUUU]
      [>....................]  reshape =  1.0% (30966724/2930134016) finish=3754.5min speed=12868K/sec
      bitmap: 1/22 pages [4KB], 65536KB chunk

unused devices: <none>
onk-04@onk-04:~$ sudo mdadm -D /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue Jul 24 20:07:56 2018
        Raid Level : raid5
        Array Size : 14650670080 (13971.97 GiB 15002.29 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 7
     Total Devices : 7
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Nov 21 11:29:47 2019
             State : clean, reshaping
    Active Devices : 7
   Working Devices : 7
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

    Reshape Status : 1% complete
     Delta Devices : 1, (6->7)

              Name : ubuntu:0
              UUID : 882dd249:20ffab80:ed73e6ca:8306118b
            Events : 80524

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       1       8       64        1      active sync   /dev/sde
       2       8       80        2      active sync   /dev/sdf
       3       8      128        3      active sync   /dev/sdi
       4       8      144        4      active sync   /dev/sdj
       7       8      160        5      active sync   /dev/sdk
       6       8      176        6      active sync   /dev/sdl
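As an aside, the finish estimate in that /proc/mdstat line is internally consistent with the position and speed it reports; a quick check (just a sketch with bc, using the block counts shown above):

$ echo '(2930134016 - 30966724) / 12868 / 60' | bc -l
# remaining KB, divided by the reported KB/s, divided by 60:
# roughly 3755 minutes, matching the reported finish=3754.5min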
 
  > 4.  At this time, the RAID starts from boot as inactive.
 > Performing a stop and assemble starts the reshape process, which is
 > stuck around 1%.  Reshape speeds start out circa 33,000kbps and
 > fall to 1kbps in around 4~6 hours.

Does this performance degradation happen gradually or suddenly?  If
suddenly, does it take 4-6 hours and then suddenly fall, or does it
happen in less than an hour?  What other work is the machine doing?

Speed degradation is gradual.  It would graph as a "J" lying on its side, in which the rate of change of the slowdown itself also shrinks over time until the speed approaches zero.  I am not running the command at the moment so cannot share any output, but it is easy to watch; the header it prints looks like this:

Every 600.0s: cat /proc/mdstat
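(That header is what watch prints for a 600-second interval; the underlying command would be something like the following, with the interval inferred from the header:)

$ watch -n 600 cat /proc/mdstat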

To be honest, this does not sound like a healthy array to me.  The job
to be done is not terribly complicated in itself (although doing it
while the array is active sounds like an exercise in concurrent
futility), and by adding a disk you provide plenty of buffer space for
the critical region.  I see no algorithmic reason why the reshape
process should stall; in fact as the process goes on it should get
faster as it becomes unnecessary to buffer the transfers.
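If it helps narrow the speed question down, the md layer also exposes its throttle and cache settings; reading them is harmless (these are the standard md procfs/sysfs attributes, shown only as a sketch, with no claim they are the culprit):

$ cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
$ cat /sys/block/md0/md/sync_speed_min /sys/block/md0/md/sync_speed_max
$ cat /sys/block/md0/md/stripe_cache_size    # raid5/6 only; small values can throttle a reshape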

Are you seeing disk errors on the array component devices in the logs?

All disks are reported clean by mdadm.
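In case it is useful, SMART and the kernel log are the more direct places to look for component-disk errors (a sketch, assuming smartmontools is installed; /dev/sdd is just the first member from the -D listing above):

$ sudo smartctl -H /dev/sdd                      # overall health verdict
$ sudo smartctl -A /dev/sdd                      # attributes: reallocated / pending sectors
$ dmesg | grep -iE 'ata[0-9]|i/o error|sd[d-l]'  # kernel-side I/O errors on the members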
 
 > After about 18~24 hours, the reshape stops completely

How do you know it stops?  Does it say "stopped" or does it say "0kbps"?

From the output of the "watch 'cat /proc/mdstat'" command above.  I usually reboot when it reaches 1K, but I have twice seen cat /proc/mdstat simply print without the speed line included at all.
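One way to tell whether the reshape thread is truly idle rather than just very slow is the md sysfs state (standard attributes; only a sketch):

$ cat /sys/block/md0/md/sync_action       # should say "reshape" while it is running
$ cat /sys/block/md0/md/sync_completed    # sectors done / total; if this stops advancing, it has stalled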
 
  > 5.  sudo mdadm -D /dev/md0 produces completely normal output.

Please provide this output.  Have you tried "-D --verbose"?

See above for -D.  The --verbose flag does not change the output.
 
 > I want to be able to mount and view the RAID without losing the
 > data.

Have you tried mounting it read-only?  That will prevent the reshape
from restarting, and according to my understanding of the mdadm
manpage, the array should be in a consistent state, and can be read.

Yes.  I assembled it as read-only and was told the RAID could not be mounted in that state; I could not open it at all.  I will re-re-read the manpages to see whether any further enlightenment dawns.
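For completeness, the read-only attempt looked roughly like this (a sketch rather than an exact transcript; --readonly and -o ro are the standard mdadm and mount options, and /mnt/raid is only an example mount point):

$ sudo mdadm --stop /dev/md0
$ sudo mdadm --assemble --readonly /dev/md0
$ sudo mount -o ro /dev/md0 /mnt/raid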
In the meantime, lsof has stopped reporting any activity for the RAID, and dmesg is kicking out the following error report over and over:

INFO: task mount:4104 blocked for more than 120 seconds.
      Tainted: P           OE     5.0.0-36-generic #39-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task md0_reshape:4069 blocked for more than 120 seconds.
      Tainted: P           OE     5.0.0-36-generic #39-Ubuntu
 
So, new things; new fun; more to read up on.

Somebody with recovery experience may be able to help with getting the
data out of an array with a reshape in progress, but if mount ro
doesn't work, I think your only option for mount and view is to
complete the reshape and reassembly.
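For anyone approaching this with recovery in mind, the per-member superblocks also record how far the reshape has progressed, which is the state any recovery attempt has to start from (a sketch; /dev/sdd is one member from the listing above):

$ sudo mdadm --examine /dev/sdd | grep -i reshape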

The mdadm programmer / maintainer Neil Brown is an unstoppable fount of knowledge.  I wanted to solve the mount problem first so that I could approach him with just the speed issue (which is well documented in a number of *buntu, CentOS, ArchLinux, and Debian forums).  Many of the "solutions" are hardware- and OS-specific, and he knows the matrix of possibilities better than nearly anyone else.
