r/linuxadmin

Need help recovering LVM

Hi, I'm really hoping that someone can help.

For a decade I've had my entire life stored on two 5TB disks in a raid1 configuration. Each disk is partitioned into three corresponding partitions:

  1. 50GB /

  2. 4GB swap

  3. 4.94TB LVM

/dev/md2 is / and /dev/md3 is the physical volume for the LVM. (They used to be md0 and md1, but last time a disk died and I replaced it, the md devices renumbered themselves - I never understood why but it worked so I didn't complain.)

The LVM has one VG, which has four LVs (/home, /srv, /usr/local and /var).

Yesterday sdb died. So I ordered a new one, unplugged it and rebooted. (They are USB disks btw - those Seagate Expansion Desk thingies.)

This time the md devices didn't renumber themselves; they're still md2 and md3. md2 is fine (albeit degraded, having only one drive present), and the system itself is working - a few services didn't start because various bits of /var were missing, but after recreating those everything is OK.

The problem is that all the LVM info has vanished. /dev/mapper is empty, pvscan and vgscan and lvscan all return nothing at all. /dev/sda3 is alive and well and still an active part of /dev/md3 (which, like md2, is clean and degraded).

/etc/lvm/backup contains a 2535-byte file called fatboy-VG1 (fatboy is the name of the machine), dated 7th December last year.

Can anyone tell me how to go about extracting the LVM info - either from that backup file or from /dev/sda3 itself - to recreate my volumes?

Otherwise I have lost everything - photos, correspondence, absolutely everything.

Thanks. Apologies if I have failed to provide obvious info - I don't know where to start.
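 
EDIT: from reading the lvm docs, I think the restore sequence looks roughly like the below. I have not run any of it yet, the UUID is a placeholder to be copied out of the backup file, and I'm assuming the VG really is called fatboy-VG1 (the files in /etc/lvm/backup are named after the VG). Please correct me if this is wrong.

# the backup file is plain text - the VG layout and each PV's UUID are in it
less /etc/lvm/backup/fatboy-VG1

# if the PV label on /dev/md3 is gone, recreate it with the UUID recorded in the backup
pvcreate --uuid "xxxxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx" --restorefile /etc/lvm/backup/fatboy-VG1 /dev/md3

# restore the VG metadata from the backup and activate the LVs
vgcfgrestore -f /etc/lvm/backup/fatboy-VG1 fatboy-VG1
vgchange -ay fatboy-VG1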

State of systemd-resolved and DNSSEC? Is it still experimental?

So back in 2023 I found this post from the lead developer of systemd after struggling with getting DNSSEC to work reliably with systemd-resolved:

https://github.com/systemd/systemd/issues/25676#issuecomment-1634810897

He states that DNSSEC support is experimental.

It's almost 3 years later and I can't find any indication that it has gone from experimental to stable since then.

Does anyone know if it's "safe" to use DNSSEC with systemd-resolved as of 257.9 (the version in Debian 13)?
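 
For reference, this is how I'd turn it on if it is considered stable now. It's the standard resolved.conf(5) setting; the DNS server here is just an example:

# /etc/systemd/resolved.conf
[Resolve]
DNS=9.9.9.9
DNSSEC=yes          # or allow-downgrade, which falls back when the upstream can't validate

# then restart and check whether answers come back authenticated
systemctl restart systemd-resolved
resolvectl query example.com    # look for "Data is authenticated: yes"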


NFS over 1Gb: avg queue grows under sustained writes even though server and TCP look fine

I was able to solve this with per-BDI limits (exact knobs spelled out at the end of this edit): I set max_bytes, enabled strict_limit, set sunrpc.tcp_slot_table_entries=32, and mounted with nconnect=4 and async.

It works perfectly.

OK, actually, nconnect=8 with sunrpc.tcp_slot_table_entries=128 and sunrpc.tcp_max_slot_table_entries=128 works better for supporting commands like "find ." or "ls -R" alongside transferring files.

These are my full mount options, for future reference, in case anybody has the same problem.

These mount options are optimized for a single client, with very aggressive caching plus nocto. If you have multiple readers/writers, check before using them:

-t nfs -o vers=3,async,nconnect=8,rw,nocto,actimeo=600,noatime,nodiratime,rsize=1048576,wsize=1048576,hard,fsc  

I avoid NFSv4 since it didn't work properly with fsc; it uses a newer fscache interface that my kernel doesn't have.
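 
To spell out the BDI part: recent kernels expose per-device writeback limits under /sys/class/bdi. The BDI id and mount point below are examples - adjust for your own mount.

# find the BDI id of the NFS mount (NFS gets an anonymous 0:NN device number)
findmnt -no MAJ:MIN /mnt/nas                         # e.g. 0:53

# cap the dirty page cache for this mount and enforce the limit strictly
echo 268435456 > /sys/class/bdi/0:53/max_bytes       # 256 MiB of dirty data, pick your own value
echo 1 > /sys/class/bdi/0:53/strict_limit

# sunrpc slot tables (set before mounting, or persist via sysctl.d / modprobe.d)
echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries
echo 128 > /proc/sys/sunrpc/tcp_max_slot_table_entries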

---
Hey,

I’m trying to understand some NFS behavior and whether this is just expected under saturation or if I’m missing something.

Setup:

  • Linux client with NVMe

  • NAS server (Synology 1221+)

  • 1 Gbps link between them

  • Tested both NFSv3 and NFSv4.1

  • rsize/wsize 1M, hard, noatime

  • Also tested with nconnect=4

Under heavy write load (e.g. rsync), throughput sits around ~110–115 MB/s, which makes sense for 1Gb. TCP looks clean (low RTT, no retransmits), server CPU and disks are mostly idle.

But on the client, nfsiostat shows avg queue growing to 30–50 seconds under sustained load. RTT stays low, but queue keeps increasing.
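 
(For reference, I'm watching it with nfsiostat from nfs-utils, roughly like this - interval and mount point are examples:)

nfsiostat 5 /mnt/nas        # per-mount RTT / exe / queue stats every 5 seconds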

Things I tried:

  • nconnect=4 → distributes load across multiple TCP connections, but queue still grows under sustained writes.

  • NFSv4.1 instead of v3 → same behavior.

  • Limiting rsync with --bwlimit (~100 MB/s) → queue stabilizes and latency stays reasonable.

  • Removing bwlimit → queue starts growing again.

So it looks like when the producer writes faster than the 1Gb link can drain, the Linux page cache just keeps buffering and the NFS client queue grows indefinitely.

One confusing thing: with nconnect=4, rsync sometimes reports 300–400 MB/s write speed, even though the network is obviously capped at 1Gb. I assume that's just page cache buffering, but it makes the problem worse, imo.
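 
(You can watch that buffering build up while rsync runs, e.g.:)

watch -n1 'grep -E "^(Dirty|Writeback|NFS_Unstable):" /proc/meminfo'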

The main problem is: I cannot rely on per-application limits like --bwlimit. Multiple applications use this mount, and I need the mount itself to behave more like a slow disk (i.e., block writers earlier instead of buffering gigabytes and exploding latency).

I also don’t want to change global vm.dirty_* settings because the client has NVMe and other workloads.

Is this just normal Linux page cache + NFS behavior under sustained saturation?
Is there any way to enforce a per-mount write limit or backpressure mechanism for NFS?

Trying to understand if this is just how it works or if there’s a cleaner architectural solution.

Thanks.