Pages can be bookmarked easily, which can be incredibly important.
Pages can be bookmarked easily only sometimes! Everybody seems to have adopted the same database-centric but user-hostile (and cache/crawler-hostile) approach to paginating posts (see: any blog platform, Engadget, Tumblr, etc.).
Their brain-damaged flow goes: you make a new post. The new post appears at the top of the front page. But the front page has a limited number of story slots, so the oldest story on the front page gets pushed to page two. Page two also has limited slots, so the oldest story on page two goes to page three. Repeat for every page. Some sites have tens to hundreds of thousands of "pages." (Preemptive note to future comment haters: I know the site doesn't "push" articles to every page, because it's all just database queries, but the effect is the same.)
Every time you make a new post, you invalidate all thousand (ten thousand? 100k? million?) older pages. It's absolutely moronic. Google has no chance of keeping up. You make one new post and Google has to re-index thousands of pages.
The proper solution is to make pages "fill up," then have your root page point to the highest-numbered page. Example: http://omgpagination.tumblr.com/ would logically be page 600, and the "Older Posts" button would link to page 599. Instead, everybody right now makes / always be page 1, with "Older Posts" always linking to page 2, and so on. But those URLs aren't content-stable, and every update invalidates your cache of the entire site. The oldest posts on your site should always be on page 1, not at a moving target of page {TOTAL_POSTS / POSTS_PER_PAGE}.
In short: be a little more clever and do the right thing for your caches. What's good for your caches is inevitably good for your users too.
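Here's a minimal sketch of that fill-up scheme in Python; all names are hypothetical, and it assumes posts is an append-only list ordered oldest-first:

    PER_PAGE = 10

    def last_page(posts: list) -> int:
        """The page the root URL ("/") serves; "Older Posts" links to last_page - 1."""
        return max(1, (len(posts) + PER_PAGE - 1) // PER_PAGE)

    def posts_on_page(posts: list, page: int) -> list:
        """Slice for /page/<page>. Because posts only ever append, every page
        except the last one is immutable and permanently cacheable."""
        start = (page - 1) * PER_PAGE
        return posts[start:start + PER_PAGE]

With 6,000 posts, posts_on_page(posts, 1) returns the ten oldest posts forever; only page 600 (and the root, which mirrors it) ever changes when you publish.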
> Every time you make a new post, you invalidate all thousand (ten thousand? 100k? million?) older pages. It's absolutely moronic. Google has no chance of keeping up. You make one new post and Google has to re-index thousands of pages.
You shouldn't be allowing Google to index /page/2, /page/3, /page/4; that content is obviously going to change, and no user is ever going to search for "page 9 of the archives of example.com."
Look at Blogger URLs (the updated-max parameter; Reddit does something similar) to see an efficient and bookmark-friendly implementation of pagination. I disagree that /page/4/ style URLs are a product of database-driven design; they seem to be driven by URL legibility, a book metaphor, and convention. Total counts and large offsets are actually bad for databases, because they force the database to touch many rows rather than just the minimum necessary local information.
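A hedged sketch of what that updated-max style lookup amounts to (keyset pagination), assuming a SQLite table named posts with a published timestamp column; the table and column names are hypothetical:

    import sqlite3

    PER_PAGE = 10

    def page_before(conn: sqlite3.Connection, updated_max: str) -> list:
        """Return the PER_PAGE posts published strictly before updated_max.

        The database seeks to the boundary value via an index instead of
        counting past a huge OFFSET, and the URL carrying updated_max
        (e.g. ?updated-max=2010-05-01T00:00:00) names a stable slice of
        content no matter how many newer posts appear."""
        return conn.execute(
            "SELECT id, title, published FROM posts "
            "WHERE published < ? ORDER BY published DESC LIMIT ?",
            (updated_max, PER_PAGE),
        ).fetchall()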
You are quite right. Time and space are the important points here. There's no way of saying "Give me page 6 as it was when I bookmarked it two weeks ago."
I really like the way Techmeme handles this: basically, you can get a snapshot of how the front page looked at any given time.
It makes it much easier to find things you read a few days ago but didn't bother bookmarking, only to realize later that you need them again. "Hmm, I read about it Monday morning"... click click BANG! Everything is there just the way you remember it.
On most other news sites I usually can't find old articles even after 15 minutes of searching with both Google and the internal search.
Good point, but we sometimes delete questions from Stack Overflow, and that'd also invalidate all pages (depending on how old the question was), would it not?
"Question deleted" item would help for that, and would also be more transparent for users than silent deletion (at the cost of a few more pages to show all items, becouse some items are deleted).
Mmm, that's not a bad solution, but the clutter would bother me as a user. You might just remove the deleted items from the page (20 per page, 5 deleted on this page, now only 15), but then you might get bare pages.
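A tiny sketch of that placeholder ("tombstone") idea, with hypothetical names; a deleted question keeps its slot, so no page boundary ever moves:

    from dataclasses import dataclass

    @dataclass
    class Question:
        id: int
        title: str
        deleted: bool = False

    def render_page(items: list) -> list:
        """Deleted questions render as a stub instead of vanishing, so the
        other items on this page (and on every later page) stay put."""
        return ["[question deleted]" if q.deleted else q.title for q in items]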
Now, if I want all these pages to point to the newest page -- how do I do that?
Every time the newest page number changes (e.g. from 599 to 600), it invalidates all other pages anyway, right?
They all have to point to page 600 now.
Moreover, I have to make the code that generates the link to my newest page more complex.
Another consideration: it could be beneficial from an SEO perspective if content constantly shifts on older pages.
> Now, if I want all these pages to point to the newest page -- how do I do that?
You just link to the root. It's conceptually page 600, but you don't actually have to specify the number. Imagine that the blog is you writing pages sequentially in a book. The oldest content is on page one; the newest content is on the last page. You don't need an explicit number to get to the newest content if you use the convention that an unspecified number means to go to the last page.
If the root page has fewer than a full page of items, I would also have it load the full next page, and the "next" link would go to page 3. So your first page might have anywhere from 10 up to 19 results, and every page after that would have exactly 10.
The other option is to use a "start" parameter which is the offset of the first item to load, and you always load that item plus the next 9. However, this means you have N cached pages instead of N/10.
Assume that an incrementing id is assigned to blog posts. The first page would always have the 10 most recent items. The "next" link would look like "/blog?offset=500". So when you click that, you get the items numbered 500-491, and the "next" link on that page is "/blog?offset=490". Admittedly the URLs are ugly, but the advantage is that the contents don't change when items are added, only when they are deleted (the same can't be said for the page=N scheme).
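A sketch of that offset scheme, assuming ids increment from 1 and posts_by_id maps id to post (all names hypothetical):

    PER_PAGE = 10

    def page_items(posts_by_id: dict, offset: int) -> list:
        """Items offset, offset-1, ..., offset-9; a deleted id simply leaves
        a gap, so new posts never change what this URL returns."""
        return [posts_by_id[i]
                for i in range(offset, offset - PER_PAGE, -1)
                if i in posts_by_id]

    def older_link(offset: int) -> str:
        """The "next" (older) link steps down by a fixed stride."""
        return f"/blog?offset={offset - PER_PAGE}"

So if the newest post is id 510, the front page shows 510-501 and links to /blog?offset=500, which returns ids 500-491 forever.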
In this scenario, up to 9 items on the second page may overlap with items on the first page, right?
That would be confusing to users.
Besides, I still see no business advantage in hardwiring content to a certain page. If anything, it's better to let page content change, from an SEO perspective.
YES! Sometimes I read a few pages, come back the next day, click next, and suddenly page 5 contains what was on page 1 yesterday. See http://www.yankodesign.com/page/2/ for an example of this madness.