One of the more recent features in Pulp is the ability to use the Pulp server to push repositories out to be hosted on external servers. These Content Delivery Servers (CDS) can be used to distribute content across firewalls and geographic locations. They also provide an added layer of security, since only a selected subset of the Pulp server’s content needs to be deployed to a specific instance.

Installation and Configuration

A server is configured to run as a CDS by first installing the Pulp CDS package and its dependencies:

[root@localhost ~] yum install pulp-cds

If httpd is not already installed, it will be installed as part of this process. The package also installs the virtual host for the CDS through the /etc/httpd/conf.d/pulp-cds.conf file.

The CDS must be able to resolve the hostname of the Pulp server in order to download content from it. This is typically handled by DNS, but in development environments it often means editing /etc/hosts to add an entry for the Pulp server.
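
For a development setup, the /etc/hosts edit described above might be scripted along these lines. This is only a sketch: add_host_entry is a hypothetical helper, not part of Pulp, and 192.0.2.10 is a placeholder documentation address standing in for the real IP of the Pulp server.

```shell
# Hypothetical helper (not part of Pulp): append "IP<TAB>hostname" to a hosts
# file unless that hostname is already listed on a non-comment line.
add_host_entry() {
    ip=$1
    host=$2
    file=${3:-/etc/hosts}

    if awk -v h="$host" '
        $1 ~ /^#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    ' "$file"; then
        echo "$host is already present in $file"
    else
        printf '%s\t%s\n' "$ip" "$host" >> "$file"
        echo "added $host to $file"
    fi
}

# Example (run as root on the CDS; 192.0.2.10 is a placeholder address):
# add_host_entry 192.0.2.10 pulp.example.com
```

The optional third argument lets the helper be tried against a scratch copy of the hosts file before touching the real one.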

Additionally, the CDS must be configured to use the messaging broker on the Pulp server so that it can be controlled from the server itself. This is done through the /etc/pulp/cds.conf file:

[server]
host = pulp.example.com

Like the Pulp server itself, the CDS uses Apache to host its repositories. The typical steps to configure the Apache instance with an SSL certificate should be taken at this point.

Once the configuration changes have been made, the CDS processes are started through the init script:

[root@localhost ~] service pulp-cds start
Starting goferd                                            [  OK  ]
Starting httpd:                                            [  OK  ]

Usage

Once the CDS is running, it must be registered to a Pulp server. A CDS may only be registered to one Pulp server at a time, and once registered it will only accept commands from that server. When a CDS is unregistered from a Pulp server, it is once again open to be registered by a different Pulp server.

Registration

Registration is done through the pulp-admin cds commands. The registration command requires the hostname of the CDS as identification. Keep in mind, however, that the Pulp server itself does not need to be able to resolve the CDS hostname to an IP address. Rather, the hostname is used to determine the unique message bus ID for that CDS. A display name can also be specified using the --name argument.

[root@localhost ~] pulp-admin cds register --hostname cds.example.com
Successfully registered CDS [cds.example.com]

Registered CDS instances can be displayed through the pulp-admin cds list command:

[root@localhost ~] pulp-admin cds list
+------------------------------------------+
                CDS Instances
+------------------------------------------+
 
Name                	cds.example.com           
Hostname            	cds.example.com           
Description         	None                     
Repos               	None                     
Last Sync           	Never                    
Status:
   Responding       	Yes                      
   Last Heartbeat   	2011-05-12 17:07:21.834959+00:00

Repository Association

A registered CDS is neat, but not very useful on its own. Repositories are associated with a CDS to customize the content it serves. Repositories that are protected on the Pulp server using repository authentication will also be protected on any CDS instance they are associated with (more on that in a future blog post).

[root@localhost ~] pulp-admin cds associate_repo --hostname cds.example.com --repoid demo
Successfully associated CDS [cds.example.com] with repo [demo]

CDS Synchronization

Repository association only configures the server’s knowledge of which repositories belong on which CDS instances. To get the bits onto a CDS instance, the instance must be synchronized. The pulp-admin cds sync command triggers this process and the pulp-admin cds status command is used to check its progress (in the future, this will be enhanced to more closely resemble the abilities for monitoring repo syncs).

[root@localhost ~] pulp-admin cds sync --hostname cds.example.com
Sync for CDS [cds.example.com] started
Use "cds status" to check on the progress
 
[root@localhost ~] pulp-admin cds status --hostname cds.example.com
+------------------------------------------+
                 CDS Status
+------------------------------------------+
 
Name                	cds.example.com           
Hostname            	cds.example.com           
Description         	None                     
Repos               	demo                     
Last Sync           	2011-05-12 13:20:35-04:00
Status:
   Responding       	Yes                      
   Last Heartbeat   	2011-05-12 17:23:54.328154+00:00
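
Since the status output above is plain text, a script that wants to know whether a sync has happened can pull out the Last Sync field. The following is a rough sketch that assumes the output keeps the layout shown above; last_sync is a hypothetical helper, not a pulp-admin feature.

```shell
# Hypothetical helper: read "pulp-admin cds status" output on stdin and
# print the value of its "Last Sync" field ("Never" before the first sync).
last_sync() {
    sed -n 's/^Last Sync[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
}

# Example, assuming pulp-admin is installed and the CDS is registered:
# pulp-admin cds status --hostname cds.example.com | last_sync
```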

Repository contents are stored in the /var/lib/pulp-cds directory on the CDS instance:

[root@localhost ~] ll -R /var/lib/pulp-cds/
/var/lib/pulp-cds/:
total 8
-rw-r--r--. 1 root root    5 May 12 13:23 cds_repo_list
drwxr-xr-x. 3 root root 4096 May 12 13:23 demo
 
/var/lib/pulp-cds/demo:
total 8
-rw-r--r--. 1 root root 2184 May 12 13:23 pulp-demo-1.0-1.fc14.x86_64.rpm
drwxr-xr-x. 2 root root 4096 May 12 13:23 repodata
 
/var/lib/pulp-cds/demo/repodata:
total 28
-rw-r--r--. 1 root root  745 May 12 13:23 filelists.sqlite.bz2
-rw-r--r--. 1 root root  308 May 12 13:23 filelists.xml.gz
-rw-r--r--. 1 root root  736 May 12 13:23 other.sqlite.bz2
-rw-r--r--. 1 root root  357 May 12 13:23 other.xml.gz
-rw-r--r--. 1 root root 1721 May 12 13:23 primary.sqlite.bz2
-rw-r--r--. 1 root root  662 May 12 13:23 primary.xml.gz
-rw-r--r--. 1 root root 2685 May 12 13:23 repomd.xml

Repositories on a CDS are hosted at the same URL path as on the Pulp server itself:

[root@localhost ~] wget --no-check-certificate https://cds.example.com/pulp/repos/demo/pulp-demo-1.0-1.fc14.x86_64.rpm
HTTP request sent, awaiting response... 200 OK
Length: 2184 (2.1K) [text/plain]
Saving to: “pulp-demo-1.0-1.fc14.x86_64.rpm”
 
100%[==============================================================>] 2,184       --.-K/s   in 0s      
 
2011-05-12 13:28:21 (24.3 MB/s) - “pulp-demo-1.0-1.fc14.x86_64.rpm” saved [2184/2184]

Binding and Distribution

When a consumer is bound to a repository, the Pulp server takes into account any CDS instances that host the repository in question. A list of all locations at which the repository is hosted is returned to the consumer and used as a mirror list for the repository. The Pulp server keeps track of these distributions and organizes the mirror lists to spread consumer load across all hosts of a given repository. In the future, this can be enhanced to take into account factors other than simple round-robin distribution and to determine the “best” CDS for a given repository for each consumer.
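
To make the round-robin idea concrete, here is a rough sketch of how rotating a mirror list spreads consumers across hosts. It is not Pulp's actual implementation: mirror_list, the bind counter, and cds2.example.com are all made up for illustration, and the URL layout simply follows the wget example earlier in this post.

```shell
# Hypothetical sketch: print the mirror list for a repository, rotated so
# that each successive bind starts at a different host.
# Usage: mirror_list REPO_ID BIND_COUNT HOST...
mirror_list() {
    repo=$1
    count=$2
    shift 2
    [ "$#" -gt 0 ] || return 1

    # Move the first (count mod number-of-hosts) hosts to the end of the list.
    skip=$(( count % $# ))
    while [ "$skip" -gt 0 ]; do
        first=$1
        shift
        set -- "$@" "$first"
        skip=$(( skip - 1 ))
    done

    for host in "$@"; do
        printf 'https://%s/pulp/repos/%s/\n' "$host" "$repo"
    done
}

# Two consecutive binds to "demo" get the same hosts in a different order:
# the first list starts at pulp.example.com, the second at cds.example.com.
mirror_list demo 0 pulp.example.com cds.example.com cds2.example.com
mirror_list demo 1 pulp.example.com cds.example.com cds2.example.com
```

Every consumer still receives the full list, so a client can fall back to the next mirror if its first choice is unavailable.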

Conclusion

In the middle of writing this, my boss pinged me to tell me he set up a demo environment using a Pulp server in Boston and a CDS instance in Tel Aviv. Geographic distribution is just one of the problems addressed by the CDS functionality. CDS instances can also be configured with only a subset of repositories to better control which content is available to which clients.

The next big CDS-related feature on the horizon is the introduction of CDS groups. The implications extend beyond the simple administrative benefit of being able to affect more than one CDS at a time. The intention is that CDS instances will be able to use other CDS instances in the same group for load balancing and failover scenarios.

In the meantime, the basic story for Content Delivery Servers is implemented and available in Pulp. This story includes the ability to register CDS instances, associate and synchronize repositories on a per-CDS basis, and pass this information to consumers when they are bound, so that repository accesses are balanced across all possible sources.