Creating a CentOS mirror site

From Peter Pap's Technowiki
Revision as of 05:46, 13 July 2011 by Ppapa (talk | contribs)


The best way to create a CentOS mirror is to use rsync. This has been documented elsewhere. However, if, like me, you are stuck behind a corporate firewall and no one is willing to open port 873 (the rsync port) for you, you can do it all with wget over port 80. Each mirror site usually has repositories for multiple versions of CentOS. I only wanted to mirror the repository for CentOS 5.6. Here's how I did it.

1. Set up a server running Apache. I'm sure you can find instructions for that somewhere else!

2. Create a directory to hold the CentOS mirror data. You'll need a fair amount of space, as the mirror for even just one version of CentOS can be over 20 GB, even without the ISOs.

 mkdir -p /export/htdocs/pub/centos

Make this the root directory for your web server.
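
Before starting a download of this size, it may be worth checking that the filesystem has enough room. The following is only a rough sketch: the function name `has_free_space` is made up for illustration, and the 25 GB threshold is an assumption based on the "over 20 GB" figure above, so adjust it to taste.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the filesystem holding DIR
# has at least NEED_GB gigabytes available.
has_free_space() {
    dir=$1
    need_gb=$2
    # A directory that does not exist cannot be checked.
    [ -d "$dir" ] || return 1
    # df -P gives POSIX-format output and -k reports 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Assumed threshold of 25 GB, not a figure from any CentOS documentation.
MIRROR_DIR=/export/htdocs/pub/centos
if has_free_space "$MIRROR_DIR" 25; then
    echo "Enough space in $MIRROR_DIR"
else
    echo "Less than 25 GB free in $MIRROR_DIR (or it does not exist)" >&2
fi
```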

3. Pick a good mirror site to mirror from. You can find the list here.

4. Use wget to get the contents of the CentOS root that are not specific to a given version, such as the GPG keys.

 cd /export/htdocs/pub/centos
 wget -r -np -nH --cut-dirs=2 --exclude-directories=pub/centos/3.9,pub/centos/4,pub/centos/4.3,pub/centos/4.4,pub/centos/4.5,pub/centos/4.6,pub/centos/4.7,pub/centos/4.8,pub/centos/4.9,pub/centos/5,pub/centos/5.0,pub/centos/5.1,pub/centos/5.2,pub/centos/5.3,pub/centos/5.4,pub/centos/5.5,pub/centos/5.6 http://myfavouritemirror.com/pub/centos/
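
Typing that exclude list by hand is easy to get wrong, so here is one possible way to build it from a list of version names. This is just a sketch: `build_excludes` is a made-up helper name, and the final line only prints the wget command (a dry run) rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: join directory names into the comma-separated
# list that wget's --exclude-directories option expects.
# usage: build_excludes PREFIX NAME [NAME...]
build_excludes() {
    prefix=$1
    shift
    list=""
    for v in "$@"; do
        # Add a comma only when the list already has an entry.
        list="${list:+$list,}$prefix/$v"
    done
    printf '%s\n' "$list"
}

# Same version directories as in the command above.
EXCLUDES=$(build_excludes pub/centos 3.9 4 4.3 4.4 4.5 4.6 4.7 4.8 4.9 \
    5 5.0 5.1 5.2 5.3 5.4 5.5 5.6)

# Dry run: print the command so it can be checked before running it.
echo wget -r -np -nH --cut-dirs=2 --exclude-directories="$EXCLUDES" \
    http://myfavouritemirror.com/pub/centos/
```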

The wget arguments do the following:

-r Recursively download.
-np Do not ascend to the parent directory when retrieving recursively.
-nH Disable generation of host-prefixed directories. By default, invoking wget with ‘-r http://myfavouritemirror.com’ will create a structure of directories beginning with myfavouritemirror.com/. This option disables such behavior.
--cut-dirs=2 Ignore the first two directory components of the remote path. Even with the -nH argument, wget would otherwise create the directories pub/centos/ under the current directory. This gets rid of them so the data is downloaded directly into the directory you're in.
--exclude-directories=..... Comma-separated list of directories you don't want to download.
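
To see how -nH and --cut-dirs interact, the sketch below imitates the way wget picks the local directory for a remote one. It is a simplified model for illustration only, not wget's actual code: `local_dir` is a made-up function, and it ignores file names, query strings and the other directory options.

```shell
#!/bin/sh
# Hypothetical model of how wget maps a remote directory to a local one.
# usage: local_dir HOST REMOTE_DIR CUT_DIRS NO_HOST   (NO_HOST is yes or no)
local_dir() {
    host=$1
    path=$2
    cut=$3
    nohost=$4
    # Strip one leading and one trailing slash: /pub/centos/ -> pub/centos
    path=${path#/}
    path=${path%/}
    # Drop one leading directory component per --cut-dirs level.
    i=0
    while [ "$i" -lt "$cut" ] && [ -n "$path" ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   path="" ;;
        esac
        i=$((i + 1))
    done
    if [ "$nohost" = yes ]; then
        result=$path                      # -nH: no host-prefixed directory
    else
        result=$host${path:+/$path}
    fi
    # An empty result means "the directory you're in".
    printf '%s\n' "${result:-.}"
}

local_dir myfavouritemirror.com /pub/centos/ 0 no    # myfavouritemirror.com/pub/centos
local_dir myfavouritemirror.com /pub/centos/ 0 yes   # pub/centos
local_dir myfavouritemirror.com /pub/centos/ 2 yes   # .
```

The last call corresponds to the options used in step 4: with both -nH and --cut-dirs=2, nothing is left of the remote path, so files land directly in the current directory.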