Abstract Graph copying is used in parallel implementations of functional languages on distributed-memory architectures. This paper examines the costs of graph copying in detail and shows that they can form a bottleneck for a class of serious parallel programs. This class comprises not only practical divide-and-conquer programs, of which parallel matrix multiplication is an example, but also programs that use pipelines, such as the sieve of Eratosthenes. Copying costs vary widely between data structures. We show how arrays can be used to reduce copying costs considerably, which resulted in significant speed-ups for the examples above.