Creating a CentOS mirror site
The best way to create a CentOS mirror is to use rsync, and that approach has been documented elsewhere. However, if, like me, you're stuck behind a corporate firewall and no one is willing to open port 873 for you, you can do it all with wget over port 80. Each mirror site usually has repositories for multiple versions of CentOS. I only wanted to mirror the repository for CentOS 6.0. Here's how I did it.
1. Set up a server running Apache. I'm sure you can find instructions for that elsewhere!
2. Create a repository to hold the CentOS mirror data. You'll need a fair amount of space, as the mirror for even just one version of CentOS can be over 20 GB, even without the ISOs.
mkdir -p /export/htdocs/pub/centos
Make this the root directory for your web server.
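Since the mirror needs a fair amount of space, it can be worth checking the free space before starting. The following is a minimal sketch, not part of my original procedure. The `REQUIRED_GB` threshold of 25 and the helper names `free_gb` and `has_enough_space` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: check that the filesystem holding the mirror has enough free space.
# MIRROR_DIR matches the path used above; REQUIRED_GB is an assumed threshold.
MIRROR_DIR="${MIRROR_DIR:-/export/htdocs/pub/centos}"
REQUIRED_GB="${REQUIRED_GB:-25}"

# Print the free space, in whole GB, on the filesystem containing directory $1.
free_gb() {
    # -P forces one output line per filesystem, -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed if $1 (available GB) is at least $2 (required GB).
has_enough_space() {
    [ "$1" -ge "$2" ]
}

if [ -d "$MIRROR_DIR" ]; then
    avail=$(free_gb "$MIRROR_DIR")
    if has_enough_space "$avail" "$REQUIRED_GB"; then
        echo "OK: ${avail} GB free under $MIRROR_DIR"
    else
        echo "WARNING: only ${avail} GB free under $MIRROR_DIR, want ${REQUIRED_GB} GB"
    fi
fi
```

Adjust `REQUIRED_GB` upwards if you also plan to keep the ISOs.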
3. Pick a good mirror site to mirror from. You can find the list here.
4. Use wget to get the contents of the CentOS root that are not specific to a given version, such as the GPG keys.
cd /export/htdocs/pub/centos
wget -r -np -nH --cut-dirs=2 \
    --exclude-directories=pub/centos/3\*,pub/centos/4\*,pub/centos/5\*,pub/centos/6\* \
    http://myfavouritemirror.com/pub/centos/
The wget arguments do the following:
|-r||Turn on recursive retrieval, so wget follows the links in each directory listing|
|-np||Do not ascend to the parent directory when retrieving recursively|
|-nH||Disable generation of host-prefixed directories. By default, invoking Wget with ‘-r http://myfavouritemirror.com’ will create a structure of directories beginning with myfavouritemirror.com/. This option disables such behavior.|
|--cut-dirs=2||Even with the -nH argument, wget would still create the directories pub/centos/. This option strips those two leading directories so the data is downloaded directly into the directory you're in.|
|--exclude-directories=.....||Comma-separated list of directories you don't want to download. Wildcards are allowed, which is why each * is escaped from the shell with a backslash.|
You should now have a directory with GPG keys for all the versions and a few other directories.
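To make the effect of -nH and --cut-dirs more concrete, here is a small sketch that models how a URL maps to a local path. The `local_path` function is a hypothetical helper written for illustration, not part of wget, and the key file name is only an example of something found in the CentOS root.

```shell
#!/bin/sh
# Sketch: model how -nH and --cut-dirs=N turn a URL into a local path.
# local_path is a hypothetical helper, not something wget provides.
# usage: local_path URL CUT_DIRS
local_path() {
    url=$1
    cut=$2
    path=${url#*://}    # drop the scheme, e.g. "http://"
    path=${path#*/}     # drop the host name, which is what -nH does
    # Drop one leading directory per iteration, as --cut-dirs does.
    # The file name itself is never removed.
    while [ "$cut" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   break ;;
        esac
        cut=$((cut - 1))
    done
    printf '%s\n' "$path"
}

local_path http://myfavouritemirror.com/pub/centos/RPM-GPG-KEY-CentOS-6 2
# prints: RPM-GPG-KEY-CentOS-6
local_path http://myfavouritemirror.com/pub/centos/RPM-GPG-KEY-CentOS-6 0
# prints: pub/centos/RPM-GPG-KEY-CentOS-6
```

With --cut-dirs=2 the file lands in the current directory. Without it, you would end up with an extra pub/centos/ tree inside your web root.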
5. Use wget to download the version of CentOS that you need, in my case 6.0, while excluding the ISO installer images.
wget -r -np -nH --cut-dirs=2 \
    --include-directories=pub/centos/6.0 \
    --exclude-directories=pub/centos/3\*,pub/centos/4\*,pub/centos/5\*,pub/centos/6,pub/centos/6.0/isos \
    http://myfavouritemirror.com/pub/centos/
The new wget arguments do the following:
|--include-directories||Comma-separated list of directories you do want to download. Directories outside this list are not followed.|
This will download just the 6.0 directory and its contents, while excluding the isos directory for this version.
You've now created your mirror site!
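A quick sanity check after the download is to list the yum repositories that actually arrived. This is a rough sketch, assuming the usual layout in which each repository has a repodata/repomd.xml file. The `list_repos` helper is hypothetical and was not part of my original steps.

```shell
#!/bin/sh
# Sketch: list the yum repositories present under the mirror directory.
# Assumes each repository contains repodata/repomd.xml.
MIRROR_DIR="${MIRROR_DIR:-/export/htdocs/pub/centos}"

# Print the top-level path of every repository found under directory $1.
list_repos() {
    find "$1" -type f -name repomd.xml -path '*/repodata/*' \
        | sed 's|/repodata/repomd.xml$||' \
        | sort
}

if [ -d "$MIRROR_DIR" ]; then
    list_repos "$MIRROR_DIR"
fi
```

If the list comes back empty, or a repository you expected is missing, the include and exclude lists are the first thing to check.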
If you want to update it later to pick up newer update RPMs and so on, you can do this:
wget -r -N -np -nH --cut-dirs=2 \
    --include-directories=pub/centos/6.0 \
    --exclude-directories=pub/centos/3\*,pub/centos/4\*,pub/centos/5\*,pub/centos/6,pub/centos/6.0/isos \
    http://myfavouritemirror.com/pub/centos/
where the extra argument is:
|-N||Turn on timestamping. This compares the timestamp and size of each file on the local server with those on the source server. If both are the same, the file is not downloaded. If the timestamp or size differs, the file is downloaded again.|
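The timestamping check can be pictured with a simplified model. The sketch below is not wget's actual code, and `should_download` is a hypothetical helper. It follows the wget manual's description, in which a file is fetched again when the remote copy is newer or the sizes do not match.

```shell
#!/bin/sh
# Sketch: simplified model of the -N decision, not wget's real implementation.
# should_download is a hypothetical helper.
# usage: should_download LOCAL_MTIME LOCAL_SIZE REMOTE_MTIME REMOTE_SIZE
# Times are seconds since the epoch, sizes are bytes. Prints "yes" or "no".
should_download() {
    if [ "$3" -gt "$1" ] || [ "$4" -ne "$2" ]; then
        echo yes
    else
        echo no
    fi
}

should_download 1000 500 1000 500   # same timestamp, same size: prints no
should_download 1000 500 2000 500   # remote copy is newer: prints yes
should_download 1000 500 1000 700   # sizes differ: prints yes
```

A file that does not exist locally is always downloaded, which this model leaves out.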