v0.1.0
The first public release of reddit: the full command surface, the reddit library, the .json view, and the crawl pipeline.
The first public release. reddit is a single pure-Go binary that turns the
public .json view of Reddit into structured records: list a subreddit, read a
comment tree, look up users and communities, search, pull community metadata,
and crawl in bulk. It talks to www.reddit.com over plain HTTPS with no API key
and no account, so there is nothing to register and nothing to pay for.
What you get
- Read listings. reddit posts walks a subreddit by hot, new, top, rising, or controversial, across as many pages as you ask for, and post fetches individual links by id or URL.
- Read comment trees. reddit comments flattens a discussion into one record per comment, keeping depth and parent links, with --expand to follow the collapsed "load more" stubs through the morechildren endpoint.
- Look up profiles. subreddit and user return structured records, and user-posts and user-comments list what a person submitted and wrote.
- Search and discover. search queries posts site-wide or inside one community, and subreddits and users discover communities and people by name.
- Read community metadata. rules, mods, wiki, wiki-pages, and duplicates read the data around a community.
- Classify offline. id turns any URL or id into its (kind, id) pair without a request, following Reddit's "thing" types.
- Crawl in bulk. seed emits post URLs from listings, crawl drains the queue into a local SQLite store, and db inspects and exports what you collected. cache manages the on-disk page cache.
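The offline classification in the list above can be illustrated with a short sketch. It assumes Reddit's documented fullname prefixes (t1 comment, t2 account, t3 link, t4 message, t5 subreddit, t6 award) and the usual /comments/<post>/<slug>/<comment> URL layout. The classify function and the kinds table are hypothetical names for illustration, not the tool's actual implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// kinds maps Reddit's fullname prefixes to readable kinds.
var kinds = map[string]string{
	"t1": "comment",
	"t2": "account",
	"t3": "link",
	"t4": "message",
	"t5": "subreddit",
	"t6": "award",
}

// classify is a hypothetical helper. It resolves a fullname such as
// "t3_abc123", or a post or comment URL, into a (kind, id) pair without
// making any request. ok is false when the input is not recognized.
func classify(s string) (kind, id string, ok bool) {
	// Fullname form: "<prefix>_<base36 id>".
	if prefix, rest, found := strings.Cut(s, "_"); found && rest != "" {
		if k, known := kinds[prefix]; known {
			return k, rest, true
		}
	}
	// URL form: look for ".../comments/<post>/<slug>/<comment>".
	u, err := url.Parse(s)
	if err != nil || u.Host == "" {
		return "", "", false
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	for i, p := range parts {
		if p != "comments" || i+1 >= len(parts) {
			continue
		}
		if i+3 < len(parts) {
			// A fourth segment after "comments" names a single comment.
			return "comment", parts[i+3], true
		}
		return "link", parts[i+1], true
	}
	return "", "", false
}

func main() {
	inputs := []string{
		"t3_abc123",
		"https://www.reddit.com/r/golang/comments/abc123/some_title/",
		"https://www.reddit.com/r/golang/comments/abc123/some_title/def456/",
		"not-a-thing",
	}
	for _, in := range inputs {
		kind, id, ok := classify(in)
		fmt.Println(kind, id, ok)
	}
}
```

Because nothing here touches the network, a helper like this can run before any rate-limited request is spent.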
The .json view
Every public Reddit page has a .json twin. reddit reads that view directly, so
it needs no API token and no registered app for read-only work. It knows the
shape and pagination of each endpoint (listings, comment pages, about pages,
search, rules, moderators, wiki, duplicates) and walks the right one from a name
or URL.
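To make the shape and pagination of a listing concrete, here is a minimal sketch. It relies on the public listing envelope (kind "Listing", a data.children array, and a data.after cursor that is null on the last page). The listingURL and parseListing helpers and the sample payload are illustrative assumptions, not the library's API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// listing mirrors the envelope Reddit wraps around paged results.
type listing struct {
	Kind string `json:"kind"`
	Data struct {
		After    string `json:"after"` // cursor for the next page; null decodes to ""
		Children []struct {
			Kind string `json:"kind"`
			Data struct {
				ID    string `json:"id"`
				Title string `json:"title"`
				Score int    `json:"score"`
			} `json:"data"`
		} `json:"children"`
	} `json:"data"`
}

// listingURL is a hypothetical helper that builds the .json URL for one
// page of a subreddit listing.
func listingURL(sub, sort, after string, limit int) string {
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	if after != "" {
		q.Set("after", after)
	}
	return "https://www.reddit.com/r/" + url.PathEscape(sub) + "/" + sort + ".json?" + q.Encode()
}

// parseListing decodes one page of listing JSON.
func parseListing(data []byte) (listing, error) {
	var l listing
	err := json.Unmarshal(data, &l)
	return l, err
}

func main() {
	// A hand-written sample page, not real data.
	sample := []byte(`{"kind":"Listing","data":{"after":"t3_abc123","children":[
		{"kind":"t3","data":{"id":"abc123","title":"Hello","score":42}}]}}`)

	page, err := parseListing(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range page.Data.Children {
		fmt.Println(c.Kind, c.Data.ID, c.Data.Score, c.Data.Title)
	}
	// An empty cursor means there is no further page to request.
	if page.Data.After != "" {
		fmt.Println("next:", listingURL("golang", "top", page.Data.After, 25))
	}
}
```

Walking a listing is then a loop: request a page, emit its children, and repeat with the returned cursor until it comes back empty.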
Polite by default, and the block reality
reddit waits two seconds between requests and runs two workers by default, and
sends a descriptive User-Agent, because Reddit rate-limits aggressive and
generic clients the hardest. When Reddit answers with a rate-limit page, a
403, or its "whoa there, pardner" interstitial, reddit exits cleanly with code
5, and its hint suggests slowing down or passing --cookies to lend it a
signed-in session. Datacenter and shared IPs are blocked the hardest. See
troubleshooting.
The crawl pipeline
For more than a page at a time, the pipeline has three stages: seed discovers
post URLs, crawl fetches and parses them, and db exports the result.
Everything lands in one SQLite file under
the data dir, with a content-addressed gzip page cache beside it so re-runs do
not re-fetch unchanged pages.
The reddit library
The parsing and fetching live in their own package so you can read Reddit pages from your own program without the CLI:
import "github.com/tamnd/reddit-cli/reddit"

c := reddit.NewClient(reddit.DefaultConfig())
posts, err := c.Posts(ctx, "golang", reddit.ListingParams{Sort: "top", Limit: 25}, 1)
if err != nil {
	log.Fatal(err)
}
for _, p := range posts {
	fmt.Println(p.Score, p.Title)
}
Independent and public-data only
reddit is an independent, open-source tool. It is not affiliated with, endorsed by, or sponsored by Reddit, Inc. It reads only public pages, at a polite default rate.
Install
go install github.com/tamnd/reddit-cli/cmd/reddit@latest
Prebuilt archives for Linux, macOS, Windows, and FreeBSD, plus Linux packages (deb, rpm, apk), SBOMs, and cosign-signed checksums, are on the release page. There is also a Homebrew cask and a Scoop entry:
brew install --cask tamnd/tap/reddit
The multi-arch container image is on GHCR:
docker run --rm ghcr.io/tamnd/reddit:0.1.0 posts golang
The binary is pure Go (CGO_ENABLED=0) with no runtime dependencies.