About / FAQ 3 October 2022
I find the concept of building crawlers intriguing. For me, this website is currently more of an intellectual exercise than something I think will take off. My goal is to figure out how to effectively discover and crawl directories, index the data in a usable manner, and present it in a helpful way.
What are Open Directories? "Open Directory" is the term for any server that lets a browser remotely list and access its files. These files might be shared intentionally, maliciously, or accidentally. Most Open Directories are simply web servers configured to make their web directories browsable. When you visit a website with images, the images are usually not stored in the same folder as the page you're viewing. Instead, they're stored in a separate folder. If you examine the page's HTML source, you might see something like "img src='example.com/img/example.png'" used to display an image on the page. This indicates that the website has an /img/ folder. You can copy the URL, remove example.png from it, and paste the result into a browser. If the web server allows it, you'll see a list of all the files in the /img/ folder. That listing is an open directory. Sometimes it is the result of a misconfiguration; other times, it is completely on purpose.
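The "remove the file name from the URL" step can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name parent_directory_url is made up for this example and is not part of the site.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_directory_url(file_url: str) -> str:
    """Return the URL of the folder that contains the file at file_url."""
    parts = urlsplit(file_url)
    # Drop the last path segment (the file name) and keep a trailing slash.
    folder = posixpath.dirname(parts.path)
    if not folder.endswith("/"):
        folder += "/"
    # Discard the query string and fragment; they belong to the file, not the folder.
    return urlunsplit((parts.scheme, parts.netloc, folder, "", ""))


print(parent_directory_url("https://example.com/img/example.png"))
```

Whether the resulting folder URL actually shows a listing depends entirely on how the web server is configured.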
What can be found in Open Directories? The answer is literally anything. Any file you could possibly imagine exists out there, openly available for you to view. You just need to know what you're looking for and where or how to search for it. Movies, songs, audiobooks... again, literally anything.
Do you host any files yourself? No, I neither host nor vouch for any of the files you may find on this site. I merely index their existence and allow you to search through my index. I have no idea whether any of the files you download are legitimate or free of viruses. I generally recommend sticking to files that can be viewed in the browser as opposed to those you have to download. If you're searching for a video, for example, you could add "mp4" to the end of your search. MP4 files can usually be played directly in your browser, while MKV, another popular file type for videos, generally cannot. You typically have to download the entire MKV file before you can view it.
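As a rough illustration of the "stick to files the browser can play" advice, the check below looks only at a link's file extension. The BROWSER_PLAYABLE set and the function name are assumptions made for this sketch; actual playback also depends on the codecs inside the file and on your browser.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption: extensions that common browsers can usually play inline.
BROWSER_PLAYABLE = {".mp4", ".webm"}


def is_browser_playable(url: str) -> bool:
    """Guess from the file extension whether a browser can play the link directly."""
    path = urlsplit(url).path
    # splitext returns (root, extension); compare case-insensitively.
    extension = posixpath.splitext(path)[1].lower()
    return extension in BROWSER_PLAYABLE
```

For example, is_browser_playable("https://example.com/video/clip.mkv") returns False, which suggests the file would need to be downloaded first.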
I clicked on a link, but no file was downloaded, or I got an unauthorized error. What's up? These links are only valid while the server is still actively hosting the files at the same paths. If the server's owner rearranges the files or removes permission to view them, the links stop working. That's why I include a timestamp. Older entries in the index are more likely to be dead, while newer ones are more likely to still work.
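If you want to test a link before clicking it, a HEAD request is one way to do so. The sketch below uses only Python's standard library; the function names and the status labels are my own choices for illustration, and some servers reject or mishandle HEAD requests, so treat the result as a hint rather than a verdict.

```python
import urllib.error
import urllib.request


def describe_status(status: int) -> str:
    """Map an HTTP status code to a rough explanation of the link's state."""
    if 200 <= status < 300:
        return "available"
    if status in (401, 403):
        return "permission removed"
    if status in (404, 410):
        return "moved or deleted"
    return "unknown"


def check_link(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request (no file body is downloaded) and describe the outcome."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is still useful.
        return describe_status(error.code)
    except OSError:
        # DNS failures, refused connections, and timeouts all land here.
        return "unreachable"
```

Calling check_link on a URL from the index returns one of the labels above, for example "permission removed" for the unauthorized error mentioned in the question.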
Do servers hosting the files track you when you download files? I would say, most likely. If not actively, then at least passively through web server access logs. That doesn't mean anyone is actively monitoring those logs or is even aware of your download.
Should I use a VPN or Tor to view these files? That decision depends on your individual risk profile. Personally, I use the internet without a VPN, sometimes to the detriment of my ISP, Cox Cable. I've scanned millions of files and open ports from my home internet. That said, I keep receiving letters from Cox urging me to behave, so maybe it's not the best approach.
Can I request that files be removed from the index? Sure. Search the index for the files you want removed, export the CSV of the search, and then send me a message on one of the social media platforms I have linked. Provide me with the CSV, the details of the rows you want removed, and your justification for their removal. I won't waste time removing links without good cause. Keep in mind, I don't host any files myself, so removing them from the index won't take the files off the internet. For that, you would need to contact the server owners directly.
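If the export contains many rows, a short script can narrow it down to the ones on your server before you send it. This is only a sketch: the column name "url" and the sample rows are assumptions, so check the header of your actual export and adjust url_column to match.

```python
import csv
import io
from urllib.parse import urlsplit


def rows_for_host(csv_text: str, host: str, url_column: str = "url") -> list[dict[str, str]]:
    """Return the exported rows whose link points at the given host name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    matches = []
    for row in reader:
        # Missing or empty cells are treated as an empty URL.
        link = row.get(url_column) or ""
        if urlsplit(link).hostname == host.lower():
            matches.append(row)
    return matches


# Hypothetical export with made-up rows, for illustration only.
sample_export = """url,timestamp
https://files.example.org/a.mp4,2022-09-30
https://other.example.net/b.mkv,2022-10-01
"""

for row in rows_for_host(sample_export, "files.example.org"):
    print(row["url"])
```

The filtered rows, together with your justification, are what the removal request needs.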