Frequently asked questions for CSRankings.org.
|
---|
Rankings are intensely popular and influential. While we might wish for a world without rankings, wishing will not make rankings go away. Given this state of affairs, it makes sense to aim for a ranking system that is meaningful and transparent. Unfortunately, the most influential rankings right now are those from US News and World Report, which are entirely reputation-based and rely on surveys sent to department heads and directors of graduate studies. By contrast, CSRankings is entirely metrics-based: it weighs departments by their presence at the most prestigious publication venues. This approach is intended to be both incentive-aligned (faculty already aim to publish at top venues) and difficult to game, since getting papers accepted at such conferences is hard. It is admittedly bean-counting, but its intent is to "count the right beans." It is also entirely transparent; all code and data are publicly available at https://github.com/emeryberger/CSRankings under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (note: this means you may not distribute anything built from CSRankings' code or data).
|
|
Unfortunately, citation-based metrics have been repeatedly shown to be subject to manipulation. Some universities instruct faculty to cite each other, and the phenomenon of "citation cartels" is well documented. There are also methodological challenges: citation data is not freely available for all papers and changes rapidly, and citation count systems like Google Scholar disambiguate authors poorly and can be gamed. (See Et al.'s page for a humorous example.) Note that selective conferences are already a proxy for citation impact: papers published at these conferences are on average much more highly cited than papers that appear in less selective, less prestigious venues. |
|
Adjusted counts: each publication is counted exactly once, with credit split evenly across all co-authors. This approach makes it impossible to boost rankings simply by adding authors to a paper. Average count is the geometric mean of the adjusted counts per area: for n selected areas, it is the nth root of the product of each area's adjusted count plus 1. $$\mathit{averageCount} = \sqrt[n]{\prod_{i=1}^{n}(\mathit{adjustedCounts}_i + 1)}$$ This computation implicitly normalizes for publication rates and sizes of areas. Note that publications must be at least 6 pages long to be counted. |
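As a rough illustration of the average count formula, here is a minimal Python sketch. The function name `average_count` and the sample numbers are hypothetical and are not taken from the CSRankings code base; the sketch only assumes the formula as stated above.

```python
import math


def average_count(adjusted_counts):
    """Geometric mean of (adjusted count + 1) over the selected areas.

    `adjusted_counts` holds one adjusted count per selected area.
    This is an illustrative helper, not the CSRankings implementation.
    """
    n = len(adjusted_counts)
    if n == 0:
        raise ValueError("at least one area must be selected")
    # Adding 1 to each count keeps an area with zero publications
    # from forcing the whole product (and thus the mean) to zero.
    product = math.prod(count + 1.0 for count in adjusted_counts)
    return product ** (1.0 / n)


# Hypothetical department with adjusted counts in three selected areas.
print(average_count([3.0, 0.0, 8.0]))
```

In this example the product is (3 + 1)(0 + 1)(8 + 1) = 36, so the result is the cube root of 36.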
|
Nearly all categories are based on research-focused ACM SIGs. Areas not represented by ACM SIGs are intended to span most established research-centric areas of computer science. |
|
For any research-focused area to be included, at least 50 R1 institutions must have publications in the top conferences in that area in the last 10 years. This threshold is to ensure that there is enough research activity in an area to enable a meaningful ranking. A number of ACM SIGs do not meet this criterion. |
|
The conferences listed were developed in consultation with faculty across a range of institutions, including via community surveys. |
|
Only the very top conferences in each area are listed. All conferences listed must be roughly equivalent in terms of number of submissions, selectivity and impact to avoid creating incentives to target less selective conferences. |
|
Additional conferences are not listed when they are not roughly equivalent to the rest in terms of number of submissions, selectivity and citation impact. For example: in the area of programming languages, PLDI and POPL currently get roughly 300 and 220 submissions each year, respectively. Their acceptance rates over the last 10 years are 20% and 21%, while their citation impacts (measured by h5-median, via Google Scholar) are 69 and 65 (higher is better). For illustration, here are the stats for other conferences in this area which did not make the cut:
|
|
A single faculty member gets 1/N credit for a paper, where N is the number of authors, regardless of their affiliation or status (faculty, student, or otherwise). The number never changes. A paper can count for at most 1.0, which happens only when all of its authors are, or eventually become, faculty in the database. The key downside to counting papers without adjusting for authors is that it would make it trivial to inflate the effect of writing a single paper simply by adding authors. Splitting authorship credit gives authors an incentive to assign authorship appropriately. Note that publication rates are normalized across areas. |
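The 1/N rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `adjusted_counts`, the input shape (one list of distinct author names per paper), and the sample names are all hypothetical and do not reflect how CSRankings stores or processes its data.

```python
from collections import defaultdict
from fractions import Fraction


def adjusted_counts(papers, faculty):
    """Sum 1/N credit per paper for each author found in `faculty`.

    `papers` is a list of author-name lists (names assumed distinct
    within a paper); `faculty` is the set of names in the database.
    N counts every author, whether or not they are in the database.
    """
    credit = defaultdict(Fraction)
    for authors in papers:
        if not authors:
            continue  # skip malformed entries with no authors
        share = Fraction(1, len(authors))
        for author in authors:
            if author in faculty:
                credit[author] += share
    return dict(credit)


# Hypothetical data: one three-author paper and one solo paper.
papers = [["alice", "bob", "student1"], ["alice"]]
faculty = {"alice", "bob"}
print(adjusted_counts(papers, faculty))
```

Here the three-author paper contributes 1/3 to each faculty author, and the remaining 1/3 (the student's share) is credited to no one, so that paper counts for less than 1.0 in total.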
|
Here are some of the numerous downsides of only including authors present in the database:
|
|
To be included in the database, a person must be a full-time, tenure-track faculty member on a given campus who can solely advise PhD students in Computer Science. This approach thus extends the reach of the database to a number of faculty from other departments who have adjunct appointments with a CS department or similar that let them advise CS PhD students. |
|
As mentioned above, tenure-track faculty who can advise PhD students in CS can be included regardless of their home department. The primary audience of CSRankings is prospective graduate students who are seeking a postgraduate degree in Computer Science. |
|
CSRankings uses DBLP as its data source, and DBLP does not currently index general science journals (including Science, Nature, and PNAS). |
|
Submit a pull request to the CSrankings GitHub repo. More details and a tutorial on pull requests are available here. Make sure that faculty members' names correspond to their DBLP author entries. Please also read this guide to contributing before submitting any proposed changes. |