
Extreme Slowness in IPFS Node Sync and Metadata Access on GCP

Hi everyone,
We are running two IPFS nodes on our Google Cloud Platform (GCP) infrastructure and have encountered extreme slowness both in distributing uploaded files and in accessing metadata stored on IPFS.

Current Configuration

IPFS Nodes:

  • Node 1: Accelerated DHT disabled, public IP, TCP port 4001 open.
  • Node 2: Accelerated DHT enabled, public IP, TCP port 4001 open.
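
For reference, a minimal sketch of how the Accelerated DHT client is toggled on each node (this uses the standard Kubo Routing.AcceleratedDHTClient flag; the systemd service name is an assumption from our setup):

    # Node 1: Accelerated DHT client left disabled (the default)
    ipfs config --json Routing.AcceleratedDHTClient false

    # Node 2: Accelerated DHT client enabled
    ipfs config --json Routing.AcceleratedDHTClient true

    # Restart the daemon so the routing change takes effect
    sudo systemctl restart ipfs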

Hardware:

  • VM Type: e2-standard-2 (2 vCPU, 8 GB RAM)
  • Disks: 50 GB for the operating system, 200 GB additional for IPFS data
  • Operating System: Ubuntu 22.04.2 LTS

Software Version:

  • Current Version: IPFS 0.29.0
  • Previous Version: IPFS 0.20.0 (no slowness issues while it was in use)
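
As a sanity check, the running version and routing configuration can be confirmed with something like the following (a sketch; output naturally differs per node):

    # Daemon/CLI and repo version information
    ipfs version --all

    # Active routing configuration, including the Accelerated DHT flag
    ipfs config show | grep -A 3 '"Routing"'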

Issues Encountered

File Distribution:

  • Distributing (reproviding) uploaded files is projected to take hundreds of hours, as shown in the log below.
  • Specific log from IPFS1:
    2024-08-02 19:30:23 ⚠️ Your system is struggling to keep up with DHT reprovides!
    2024-08-02 19:30:23 This means your content could partially or completely inaccessible on the network.
    2024-08-02 19:30:23 We observed that you recently provided 128 keys at an average rate of 1m20.099219968s per key.
    2024-08-02 19:30:23
    2024-08-02 19:30:23 💾 Your total CID count is ~29683 which would total at 660h26m25.146310144s reprovide process.
    2024-08-02 19:30:23
    2024-08-02 19:30:23 The total provide time needs to stay under your reprovide interval (22h0m0s) to prevent falling behind!
    2024-08-02 19:30:23
    2024-08-02 19:30:23 💡 Consider enabling the Accelerated DHT to enhance your reprovide throughput.
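
To put those numbers in context, the reprovider settings in effect and the arithmetic from the log can be checked directly on the node (a sketch; the comments reflect our values and the Kubo defaults):

    # Reprovide interval and strategy currently in effect
    ipfs config Reprovider.Interval     # 22h0m0s on our nodes
    ipfs config Reprovider.Strategy     # "all" unless overridden

    # Rough math from the log above:
    #   ~29,683 CIDs x ~80 s per provide ≈ 2.38 million s ≈ 660 h,
    #   roughly 30x longer than the 22 h reprovide interval.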

Metadata Access:

  • We also experience significant delays in accessing the metadata stored on IPFS.
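
The delay can be reproduced with something like the following (a sketch; <cid> is a placeholder for one of our metadata CIDs):

    # Time a metadata fetch through the local node
    time ipfs cat <cid> > /dev/null

    # Time how long it takes to find a single provider for that CID in the DHT
    time ipfs routing findprovs -n 1 <cid>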

Firewall Configuration

  • Open Ports: TCP 4001 for both nodes.
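
The GCP rule is roughly equivalent to the following (a sketch; the rule name, network, and target tag are placeholders from our setup):

    # Allow inbound libp2p swarm traffic on TCP 4001 to the IPFS VMs
    gcloud compute firewall-rules create allow-ipfs-swarm \
        --network=default \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=tcp:4001 \
        --target-tags=ipfs-node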

Symptoms

  • Even with adequate CPU, memory, and storage resources, the slowness persists.
  • We have verified that the nodes are correctly advertising their data in the DHT.
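
The advertising check was done roughly as follows (a sketch; <cid> is a placeholder for one of our pinned CIDs, and the findprovs call is run from a node outside GCP):

    # On the node: confirm the CID is pinned
    ipfs pin ls --type=recursive | grep <cid>

    # Confirm the node announces its public IP, not only private addresses
    ipfs id | grep -A 10 '"Addresses"'

    # From an independent node: confirm our peer ID shows up as a provider
    ipfs routing findprovs <cid>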

Resolution Attempts

Enabling Accelerated DHT:

  • On Node 2, the Accelerated DHT was enabled, but it did not resolve the issue.
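
After restarting the daemon we checked that the setting was actually active with something like the following (a sketch; as far as we know, ipfs stats provide only returns data when the accelerated client is enabled):

    # Confirm the flag is set in the live config
    ipfs config Routing.AcceleratedDHTClient

    # Provider system statistics (provide counts and average durations)
    ipfs stats provide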

Resource Check:

  • Increased RAM on Node 2 to better handle the load with Accelerated DHT.
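
The resize itself was the usual GCP procedure (a sketch; instance name, zone, and target machine type are placeholders):

    # The instance must be stopped before its machine type can be changed
    gcloud compute instances stop ipfs-node-2 --zone=europe-west1-b
    gcloud compute instances set-machine-type ipfs-node-2 \
        --zone=europe-west1-b --machine-type=e2-standard-4
    gcloud compute instances start ipfs-node-2 --zone=europe-west1-b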

Port Configuration:

  • Confirmed that only TCP port 4001 is open, as recommended.

Consultation of Forums and Resources:

  • Consulted various online sources, which suggest that difficulty finding appropriate peers is a possible cause.

Observations

We found reports that "the provider calls keep searching for longer for appropriate peers. Undiallable nodes are a big issue right now". This suggests that a large number of unreachable peers on the network is causing significant delays.
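
To gauge how much this affects our nodes, the swarm and DHT state can be inspected with something like the following (a sketch; we are not sure what values would count as healthy here):

    # Number of currently connected peers
    ipfs swarm peers | wc -l

    # State of the WAN DHT routing table
    ipfs stats dht wan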

Request for Support

We are unable to understand what is causing this extreme slowness, especially considering that this problem did not occur with the previous software version. Can anyone help us identify and resolve the issue?

Thank you.
