Just a quick post to explain what the Rebuild/Verify/Match (r-v-m) routine is... Firstly, CentOS has 2 types of packages (rpms). The first is what goes into the main distro, and is built from sources that come from 'upstream'. The second lot are packages that we add on or that people contribute, and they are hosted in the Extras/, CentOSPlus/ and Contrib/ repositories. This second lot of packages don't actually make it into the distribution.
Packages that come down the line from upstream go through a fairly automated process: they are built, processed and put into a queue. Once in the queue, they are pushed through a 'test-harness', which does some basic things, like making sure new packages have a higher Version-Release than packages with the same name in the existing repo, that the packages are signed properly, that their multi-lib shared content matches etc. Usual regular stuff. An important thing to keep in mind here is that packages are built in the order they are seen publicly. If there are multiple packages on the same day, then build order is usually timestamp[1] driven.
This process works for updates.
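To make the Version-Release check and the timestamp-driven build order a bit more concrete, here is a rough Python sketch. It is not the actual test-harness code: the `Package` tuple, its `seen_at` field and the function names are made up for illustration, and the comparison is a simplified take on how rpm orders versions (it ignores special cases such as `~`).

```python
import re
from typing import List, NamedTuple

# Runs of digits or runs of letters; everything else is just a separator.
_SEGMENT = re.compile(r"\d+|[A-Za-z]+")


class Package(NamedTuple):
    # Hypothetical record, not a real rpm or yum data structure.
    name: str
    version: str
    release: str
    epoch: int = 0
    seen_at: float = 0.0  # when the package was first seen publicly


def compare_segments(a: str, b: str) -> int:
    """Simplified rpm-style comparison: returns -1, 0 or 1."""
    sa, sb = _SEGMENT.findall(a), _SEGMENT.findall(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return 1 if xi > yi else -1
        elif x.isdigit() != y.isdigit():
            # A numeric segment counts as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return 1 if x > y else -1
    # All shared segments are equal: the string with more segments is newer.
    if len(sa) == len(sb):
        return 0
    return 1 if len(sa) > len(sb) else -1


def is_newer(candidate: Package, existing: Package) -> bool:
    """True if candidate has a higher Epoch/Version-Release than existing."""
    if candidate.epoch != existing.epoch:
        return candidate.epoch > existing.epoch
    by_version = compare_segments(candidate.version, existing.version)
    if by_version != 0:
        return by_version > 0
    return compare_segments(candidate.release, existing.release) > 0


def build_order(queue: List[Package]) -> List[Package]:
    """Order the queue by the time each package was first seen publicly."""
    return sorted(queue, key=lambda pkg: pkg.seen_at)
```

With this, `is_newer(Package("XX", "1.2.3", "1.el5"), Package("XX", "1.2.2", "4.el5"))` is true, while a package with an identical Version-Release would be rejected.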
When there is a new release to be done, things get a bit more interesting - especially for new packages that have never been released before. They might have been built more than once, and each time in an evolving buildroot. Contents of those buildroots are impossible to guess. So a new package, say XX released as XX-1.2.3-1.el5, might have been built in-house upstream at 1.2.0, 1.2.1, 1.2.2 and 1.2.3, but only the last one is released, and in some cases bundles of these packages have overlapping requirements. Hello cyclic deps. Sorting these out is a totally manual process, and can take a bit of time, as is easy to imagine. Another thing that needs to be done is to make sure that our entire tree matches upstream as much as possible. This is the r-v-m routine. Each rpm is checked against upstream's (there are quite a few tests done), and depending on the outcome it is either cycled back into the build queue or passed through as matched.
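The verify/match step can be pictured roughly like the Python sketch below. This is only an illustration, not the real r-v-m script: the actual tests are not listed in this post, so the sketch assumes each rpm has been reduced to a hypothetical "manifest" dict (attribute name to value, e.g. its file list or requires), and `verify` and `rvm_pass` are invented names.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Hypothetical: attribute name -> value extracted from one rpm.
Manifest = Dict[str, str]


def verify(ours: Manifest, upstream: Manifest) -> List[str]:
    """Return a list of differences between our rpm and upstream's."""
    problems = []
    for key in sorted(set(ours) | set(upstream)):
        if key not in ours:
            problems.append(f"missing: {key}")
        elif key not in upstream:
            problems.append(f"extra: {key}")
        elif ours[key] != upstream[key]:
            problems.append(f"differs: {key}")
    return problems


def rvm_pass(
    built: Dict[str, Manifest], upstream: Dict[str, Manifest]
) -> Tuple[List[str], Deque[Tuple[str, List[str]]]]:
    """Split built packages into matched ones and ones to rebuild."""
    matched: List[str] = []
    requeue: Deque[Tuple[str, List[str]]] = deque()
    for name, manifest in built.items():
        problems = verify(manifest, upstream.get(name, {}))
        if problems:
            # Goes back into the build queue, with the reasons attached.
            requeue.append((name, problems))
        else:
            matched.append(name)
    return matched, requeue
```

Anything that lands in `requeue` carries its list of differences along, which is the starting point for working out why the rebuild did not match.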
For every package that goes back, there is a process that guesses and tries to establish what the reason for the failure was, and how to work around it - in almost 90%[2] of the cases, it's down to different versions of packages in the buildroots. And guessing what should be in there is almost all a manual process. Also, once there are some estimated lists in place, building specific repos for each package's build run takes time, in many cases longer than the build run itself!
It should be called verify-rebuild-match, but when I wrote this script about 2 years back, there must have been a good reason to call it rebuild-verify-match; I just don't remember what that reason is anymore. Also, there are a few scripts around that do various things, but the process overall is mostly manual. Something to think about for when 5.4 comes around - or even 4.8, which is a few months away now.
btw, not all packages go to a 100% perfect match before being usable. eg a mismatch in docs generated at buildtime from source, due to text alignment, is a classic case that Johnny and I have almost always ignored. Sometimes it's down to the version of the doc render app, and is resolvable - but only if there is time and we're not already running behind :)
- KB
[1]- There are some very complex exceptions to this as well, but generally timestamp is the way it's done.
[2]- I've not actually counted them, the 90% is just a way of saying: most of the time.