Wonderer, I doubt the archives would be open for download. There is a strong possibility the format wouldn’t be recognizable by any program you have anyway. You may just have to copy it the old-fashioned way.
The database is in SQL, which you’d need a special program to read (though it’s open source). If you knew SQL, you could probably set up a program to put the different parts of the posts together in whatever format you want, and you could probably find a lot of information about that at phpBB.com, because the software sets up the database. The gzip of the database is 216MB.
The pages don’t exist as HTML until they’re accessed; there are templates in PHP that make calls to the SQL database and then generate the HTML to be displayed.
On the other hand, the database contains everyone’s password. So. . . I guess if you write the program, I’d be willing to clone the database, and, ensuring that there’s no sensitive information being extracted, run the program and send you the results. That’s the best way I know how to do it without compromising everyone’s account.
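For what it's worth, scrubbing a cloned copy before anything is extracted could look roughly like the sketch below. It is only an illustration under loud assumptions: it uses Python's built-in sqlite3 as a stand-in for the real MySQL database, and the table name `users` and the columns `user_password` and `user_email` are hypothetical, not taken from the actual phpBB schema.

```python
import sqlite3

# Assumed (hypothetical) names; the real schema may differ.
SENSITIVE_COLUMNS = ("user_password", "user_email")


def scrub_users(conn, table="users", columns=SENSITIVE_COLUMNS):
    """Blank out sensitive columns in a CLONED database, in place."""
    for column in columns:
        # Identifiers cannot be bound as SQL parameters, so they must come
        # from a fixed list like the one above, never from user input.
        conn.execute(f"UPDATE {table} SET {column} = ''")
    conn.commit()


def build_demo_clone():
    """Create a tiny in-memory database standing in for the clone."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "user_id INTEGER PRIMARY KEY, username TEXT, "
        "user_password TEXT, user_email TEXT)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        [
            (1, "alice", "not-a-real-hash-1", "alice@example.com"),
            (2, "bob", "not-a-real-hash-2", "bob@example.com"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    clone = build_demo_clone()
    scrub_users(clone)
    for row in clone.execute("SELECT username, user_password, user_email FROM users"):
        print(row)
```

The point of working on a clone is that the live database is never touched, and the export script only ever sees rows that have already been blanked.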
Internet access is not eternal and I just want to have this massive chunk of literature for perhaps a rainy day…
I did not know everyone’s passwords are in there…
Would it be possible for you to duplicate the gzip, then delete the parts with the passwords?
There must be some way to get the text of this site without visiting every page…
And Faust, there is no world government yet, come on… it will be coming into existence soon, though…
And they wouldn’t need to risk implementing subliminal messages; all they need to do is say that we are tinfoil-hat-wearing conspiracy theorists…
Conspiracy theories are not proven and therefore crackpot by definition, didn’t you know?
And I don’t think I have ever said there are subliminal messages anywhere…
If you’re referring to my thread about propaganda and marketing tactics, making you want to buy a cell phone with a hot woman is not subliminal, it’s just emotional association.
Your sudden appearance in this thread is surprising, Faust… did I really sound like I was implying some sort of social control conspiracy by asking for a copy of the database?
Yes, as far as I know, it’s possible. But, I don’t know how to do it, nor how to present it in a relatively human readable format once it’s done.
Basically, what you’d need is a script that organizes the topics by time and puts every post after the opening post for the topic, next to a user name. I think writing a script from scratch would be easiest, because the database stores usernames, topics, and posts in different rows or columns, or something like that. So, you’d have a topic with a bunch of post numbers that reference a bunch of posts, each one referencing a username. I think that’s how it works.
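To make that idea concrete, here is a minimal sketch of such a script. Everything about the schema is an assumption for illustration: three hypothetical tables (`users`, `topics`, `posts`) linked by ID columns, held in Python's built-in sqlite3 rather than the real MySQL database. The actual phpBB tables and column names may well differ.

```python
import sqlite3

# Join each post to its topic and its author, oldest topic first,
# and within a topic, oldest post first. All names are hypothetical.
QUERY = """
SELECT t.topic_id, t.topic_title, u.username, p.post_text
FROM posts AS p
JOIN topics AS t ON t.topic_id = p.topic_id
JOIN users AS u ON u.user_id = p.poster_id
ORDER BY t.topic_time, t.topic_id, p.post_time, p.post_id
"""


def export_threads(conn):
    """Return all topics as plain text, each post next to its username."""
    lines = []
    current_topic = None
    for topic_id, title, username, text in conn.execute(QUERY):
        if topic_id != current_topic:
            if current_topic is not None:
                lines.append("")  # blank line between topics
            lines.append(f"== {title} ==")
            current_topic = topic_id
        lines.append(f"{username}: {text}")
    return "\n".join(lines)


def build_demo_db():
    """Create a tiny in-memory database with the assumed layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE topics (topic_id INTEGER PRIMARY KEY,
                             topic_title TEXT, topic_time INTEGER);
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER,
                            poster_id INTEGER, post_time INTEGER,
                            post_text TEXT);
        """
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?, ?)",
        [(10, "Second topic", 200), (11, "First topic", 100)],
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
        [
            (1, 11, 1, 100, "Opening post"),
            (2, 10, 2, 200, "Another opener"),
            (3, 11, 2, 150, "A reply"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(export_threads(build_demo_db()))
```

The query never selects a password column, so even on an unscrubbed copy the output would contain only titles, usernames, and post text.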
I have plans to figure out how MySQL works, but that’s likely to be a while. If you don’t mind that it’s totally disorganized, I might be able to get you a totally disorganized table of posts, topics, and usernames in a few weeks. But that’s assuming it’s pretty easy, because I’m starting from scratch.
I’ll also ask around on the board software forums; there might be a ready-made solution to this.
The only thing you need to be careful about is not to download many pages at once, because it can use up server resources and potentially crash the server.
Don’t fetch more than 2 or 3 threads at a time. (Most software has a setting to limit simultaneous downloads.)
You can use a simple command-line tool like wget or curl, too.
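If you would rather script it yourself, the same "go easy on the server" rule can be sketched in a few lines of Python. This is only a rough illustration: the function names, the 5-second default pause, and the example URL are assumptions for the sketch, not settings from any particular offline browser. It fetches one page at a time and pauses between requests.

```python
import time
import urllib.request


def fetch_url(url, timeout=30):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def polite_download(urls, delay_seconds=5.0, fetch=fetch_url, sleep=time.sleep):
    """Fetch pages strictly one at a time, pausing between requests.

    `fetch` and `sleep` are parameters so the pacing logic can be tried
    out without touching the network.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # give the server a break between pages
        pages[url] = fetch(url)
    return pages


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    demo_urls = ["http://forum.example/viewtopic.php?t=1"]
    # Use a stand-in fetch here so the demo makes no real requests.
    result = polite_download(demo_urls, fetch=lambda u: b"<html>demo</html>")
    print(len(result), "page(s) saved")
```

Because there is never more than one request in flight, this is gentler than even the "2 or 3 threads at a time" limit, at the cost of taking longer overall.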
Thanks, Nah. I think those offline browser programs are the best bet, not only because I don’t know how to write much of a script, but because it would save the pages in an easily readable and properly organized format. And thanks for the tip about downloading a few pages at a time.
Does that work for you, Wonderer? You can run the offline browsing programs overnight, so you should be able to get the whole thing in a few days, but that might be costly if you’re using dialup.
[quote="Carleas"]
And thanks for the tip about downloading a few pages at a time.
[/quote]
You’re welcome.
I had to ban some of the most aggressive bots (these programs are considered bots, too) to keep the bandwidth and server performance within limits.
If many members use them at the same time, it can slow down the server, depending on the server specs and other things.
And there are some ways to cache the page data and serve it in static form (which means the server wouldn’t be working very hard to produce the page).
Maybe phpBB has a module or add-on already written by someone.
If the resource usage is getting higher, you may want to think about page caching and other resource-saving tricks.
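As a rough picture of what page caching means here, the sketch below keeps a finished page in memory for a while so that repeat visitors get the stored copy instead of triggering a fresh rebuild. It is a generic illustration only, not how phpBB or any particular add-on does it; the class name, the `render` callback, and the five-minute lifetime are all assumptions.

```python
import time


class PageCache:
    """Serve stored HTML until it is older than max_age_seconds."""

    def __init__(self, render, max_age_seconds=300.0, clock=time.monotonic):
        self._render = render          # expensive function: url -> html
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # url -> (stored_at, html)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]            # still fresh: no rendering work
        html = self._render(url)       # missing or stale: rebuild once
        self._entries[url] = (now, html)
        return html


if __name__ == "__main__":
    build_count = 0

    def slow_render(url):
        global build_count
        build_count += 1
        return f"<html>page for {url}</html>"

    cache = PageCache(slow_render)
    cache.get("/viewtopic?t=1")
    cache.get("/viewtopic?t=1")
    print("pages built:", build_count)
```

The trade-off is freshness: a new post may not show up for visitors until the stored copy expires, which is why the lifetime is usually kept short on an active board.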
I know I need to update a config file or two, because we’ve gotten some more RAM and we’re still running the old configuration. And phpBB caches pages already, so I don’t think I can get much out of that.
Optimization is pretty daunting. It seems like an art more than a science, which makes it hard to learn without a solid foundation.