92 Commits

Author SHA1 Message Date
Deimos
88944bed17 Run app-related services under the app user 2020-11-30 20:31:14 -07:00
Deimos
5fbc72c44c Add ability to process posts with Lua scripts
This adds the backend pieces (no interface yet) to configure Lua scripts
that will be applied to topics and comments in response to different
events.
Initially, it only supports running a script when a new topic or comment
is posted. For example, here is a Lua script that would prepend a new
topic's title with "[Text] " or "[Link] " depending on its type, as well
as replace its tags with either "text" or "link":

function on_topic_post (topic)
    if (topic.is_text_type) then
        topic.title = "[Text] " .. topic.title
        topic.tags = {"text"}
    elseif (topic.is_link_type) then
        topic.title = "[Link] " .. topic.title
        topic.tags = {"link"}
    end
end

There can be a global script as well as group-specific scripts, and the
scripts are sandboxed, with limited access to data as well as being
restricted to a subset of Lua's built-in functions. The Lua sandboxing
code comes from Splash (https://github.com/scrapinghub/splash). It will
need to be modified, but this commit keeps it unmodified so that future
changes can be more easily tracked by comparing to the original state of
the file.

The sandboxing also includes some restrictions on number of instructions
and memory usage, but this might be more effectively managed on the OS
level. More research will still need to be done on security and resource
restrictions before this feature can be safely opened to users.
2020-11-30 17:05:00 -07:00
Deimos
f4933be2dd Nginx: remove /development redirects on Docs site
These were set up to redirect the original locations of the development
pages to their new locations inside the instructions folder, but can't
be used any more now that we're creating a development folder.
2020-10-09 16:50:11 -06:00
Deimos
dd00e2e79c Add invoke tab-completion script to dev .bashrc
This enables tab-completion for the new invoke tasks in the dev version.
So for example, you can type "invoke ty<Tab>" and it will complete to
"type-checking".
2020-10-07 15:39:14 -06:00
Deimos
68870119f4 Remove remnants of Redis breached-passwords check
We've been using pts_lbsearch on the text file for a few weeks now, and
it's working fine. Checks generally seem to take about 10 ms, and that's
totally fine for the relatively uncommon events of registrations and
password changes.

This removes everything related to the previous Redis-based method,
which means we no longer need the second Redis server or the ReBloom
module.
2020-09-06 18:32:10 -06:00
Deimos
26b1d4dd9b Use pts_lbsearch to check for breached passwords
This replaces the current method of using a Bloom filter in Redis to
check for breached passwords with searching the text file directly using
pts_lbsearch (https://github.com/pts/pts-line-bisect/).

I'm not removing the Redis-based method yet because I want to test the
performance of this first, but this is *far* simpler and doesn't have
the possibility for false positives like the Bloom filter does.
2020-08-11 18:27:16 -06:00
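Editor's sketch for 26b1d4dd9b: pts_lbsearch does a binary search over the lines of a sorted text file. The Python below is only a rough illustration of that idea, not the tool's actual code. The function names are hypothetical, and it assumes the file is sorted in plain byte order with one entry per line.

```python
import os


def _line_at_or_after(f, offset: int) -> bytes:
    """Return the first full line starting at or after `offset` (b"" at EOF)."""
    if offset == 0:
        f.seek(0)
    else:
        # Read from offset - 1 up to the next newline, which leaves the file
        # position at the first line start that is >= offset.
        f.seek(offset - 1)
        f.readline()
    return f.readline()


def sorted_file_contains(path: str, target: bytes) -> bool:
    """Check whether `target` is a line of the byte-sorted file at `path`."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose following line is >= target (or EOF).
        while lo < hi:
            mid = (lo + hi) // 2
            line = _line_at_or_after(f, mid)
            if line and line.rstrip(b"\n") < target:
                lo = mid + 1
            else:
                hi = mid
        found = _line_at_or_after(f, lo)
        return bool(found) and found.rstrip(b"\n") == target
```

Unlike a Bloom filter, a lookup like this either finds the exact line or does not, so it cannot report a false positive.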
Andrew Shu
87dce83f26 Install html5validator, validate HTML in tests
Installs the Nu Html Checker and starts using it to validate the home
page's HTML: https://validator.github.io/validator/

Also includes fixes to some lists that were nested in an invalid way.
2020-08-02 19:16:52 -06:00
Deimos
6f272fcd54 Revert "Build HTML Tidy, validate homepage HTML in tests"
This reverts commit cb7be83877.

HTML Tidy seems to have various gaps in its validation that we've found
already, including one that's pretty much a deal-breaker for Tildes's
HTML: it doesn't think that <menu> is a valid parent for <li>.

We're looking at alternative validators still.
2020-08-02 14:20:37 -06:00
Andrew Shu
cb7be83877 Build HTML Tidy, validate homepage HTML in tests
Adds the HTML Tidy library to the dev version, along with the pytidylib
wrapper for it, and a couple of tests that use it to validate the HTML
of the home page.

Includes a fix to the GitLab "Planned features" link that Tidy considers
invalid because it includes some un-encoded characters.
2020-08-01 14:20:57 -06:00
Deimos
9531221b88 Salt: don't attempt to set mode on site-icons.css
Trying to change the mode of this file (which often already exists)
fails on Windows. It seems fine to just not set it and let it be set to
the default.
2020-07-20 14:56:14 -06:00
Deimos
e85dfa2492 Salt: ensure that the site-icons.css file exists
The generate_site_icons_css cronjob will create this file, but the site
won't work before it exists, so there's a (less than 5 min) gap where
the site is broken when first set up. This probably won't be noticeable
in dev/prod setups, but breaks things like CI setups where everything is
getting created freshly each time.

This makes sure that the file always exists on initial setup and
whenever the Salt states are re-run.
2020-07-12 14:33:15 -06:00
Deimos
384c5c985f Salt: move postgresql-redis bridge to own state 2020-05-15 16:45:46 -06:00
Deimos
78002847ba Fix environment check in Prometheus config
Checking for prod isn't correct - we want the monitoring server to have
these entries so that it can scrape them from prod.
2020-05-15 16:08:39 -06:00
Deimos
b011be34ef Add simple metrics to event stream consumer jobs
This adds some very simple metrics to all of the background jobs that
consume the event streams. Currently, the only "real" metric is a
counter tracking how many messages have been processed by that consumer,
but a lot of the value will come from being able to utilize the
automatic "up" metric provided by Prometheus to monitor and make sure
that all of the jobs are running.

I decided to use ports starting from 25010 for these jobs - this is
completely arbitrary, it's just a fairly large range of unassigned
ports, so shouldn't conflict with anything.

I'm not a fan of how much hard-coding is involved here for the different
ports and jobs in the Prometheus config, but it's also not a big deal.
2020-05-14 18:19:26 -06:00
Deimos
42f99a82ba Add temporary bans (manual)
This enables me to set a ban expiry time for a user (manually, in the
database). By doing so:

* The user's page will say that they're temporarily banned, and show the
  date their ban will be lifted.
* If the user tries to log in, it will say they're temporarily banned,
  and give a specific datetime that the ban will be lifted by.
* An hourly cronjob will lift any bans that have expired.
2020-05-09 14:25:20 -06:00
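Editor's sketch for 42f99a82ba: the hourly job described above could look roughly like this. The User class and its field names are hypothetical stand-ins, not the real Tildes model, and the real job works against the database rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class User:
    """Hypothetical stand-in for a user row; field names are assumptions."""

    username: str
    is_banned: bool = False
    ban_expiry_time: Optional[datetime] = None


def lift_expired_bans(users: List[User], now: datetime) -> List[str]:
    """Unban every user whose ban expiry has passed; return their usernames."""
    lifted = []
    for user in users:
        expiry = user.ban_expiry_time
        # Bans without an expiry time are treated as permanent and left alone.
        if user.is_banned and expiry is not None and expiry <= now:
            user.is_banned = False
            user.ban_expiry_time = None
            lifted.append(user.username)
    return lifted
```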
Deimos
9cd86ad33d Salt pillar: update Prometheus IP to IPv6
I've switched the Prometheus server to communicate over IPv6 now, so
this needs to be updated to make the nginx configuration correct.
2020-04-14 18:39:12 -06:00
Deimos
f2c0b68f78 Monitoring server: add blackbox exporter
This is a Prometheus exporter that allows checking IPv4 and IPv6
responses, among other things. This sets it up to make sure that the
site is responding over both IPv4 and IPv6, so that I can monitor and
set up an alert if either stops working.
2020-04-03 17:49:20 -06:00
Deimos
7d1c3297fb Add group_stats table, track daily topics/comments
This adds a group_stats table and cronjob that will insert the previous
day's stats into it each day just after 00:00 UTC.
2020-03-09 17:57:43 -06:00
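Editor's sketch for 7d1c3297fb: a job that runs just after 00:00 UTC needs the bounds of the previous UTC day. The helper below is a hypothetical illustration of that calculation only, not the commit's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def previous_utc_day(now: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the UTC day before `now`; `end` is exclusive."""
    today_start = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today_start - timedelta(days=1), today_start
```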
Deimos
01752141fc Salt: set overall server timezone to UTC 2020-03-05 20:55:12 -07:00
Deimos
89c7c13be2 Reload gunicorn when site-icons CSS updates
This starts using webassets for the site-icons.css file inside the base
template so that a cache-busting "version" string is added after the
filename as a query variable (as was already being done with the other
CSS and JS files).

It also creates a new service that's triggered by a "path changed" event
on site-icons.css, which causes gunicorn to reload. This should mean
that whenever the site-icons.css file is updated by the cronjob that
generates it, gunicorn will automatically reload and update the
cache-busting string for the CSS file, causing users' browsers to update
to the newest version.
2020-02-12 21:23:30 -07:00
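Editor's sketch for 89c7c13be2: the general idea behind a cache-busting "version" string is that the URL changes whenever the file's contents change. The function below illustrates that idea under assumptions of my own (a truncated SHA-256 digest as the version). It is not how webassets itself computes the value.

```python
import hashlib


def versioned_url(url_path: str, file_contents: bytes) -> str:
    """Append a short content hash to `url_path` as a query string."""
    # Assumption: any stable digest works; 8 hex characters keeps it short.
    version = hashlib.sha256(file_contents).hexdigest()[:8]
    return f"{url_path}?{version}"
```

Because the version is derived from the contents, it only needs to be recomputed when the file changes, which is why the commit reloads gunicorn on a "path changed" event.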
Deimos
078ca207f9 Apply PGTune recommendations to PostgreSQL in prod
This is just using the recommendations from PGTune for a web application
being hosted on a server with the prod server's specs. I'm sure they're
not the best values, but should be better than the defaults.
2020-01-28 17:13:44 -07:00
Deimos
3811ec3924 Eliminate RabbitMQ
This removes RabbitMQ as well as everything else attached to it:
Erlang; the Prometheus collector; the pg-amqp-bridge and all PostgreSQL
functions and triggers; and the amqpy Python package and the Tildes code
that used it.

Note that this commit does not actually uninstall or delete any of these
packages or services, so if you have a running instance that you want to
keep (instead of re-provisioning from scratch), you will need to
manually remove them if you want them completely gone.
2020-01-20 17:28:16 -07:00
Deimos
bcb5a3e079 Replace RabbitMQ uses with Redis streams
RabbitMQ was used to support asynchronous/background processing tasks,
such as determining word count for text topics and scraping the
destinations or relevant APIs for link topics. This commit replaces
RabbitMQ's role (as the message broker) with Redis streams.

This included building a new "PostgreSQL to Redis bridge" that takes
over the previous role of pg-amqp-bridge: listening for NOTIFY messages
on a particular PostgreSQL channel and translating them to messages in
appropriate Redis streams.

One particular change of note is that the names of message "sources"
were adjusted a little and standardized. For example, the routing key
for a message caused by a new comment was previously "comment.created",
but is now "comments.insert". Similarly, "comment.edited" became
"comments.update.markdown". The new naming scheme uses the table name,
proper name for the SQL operation, and column name instead of the
previous unpredictable terms.
2020-01-20 13:17:33 -07:00
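Editor's sketch for bcb5a3e079: the naming scheme described above (table name, SQL operation, optional column name) can be expressed as a small helper. The function name is hypothetical. Only the two example outputs come from the commit message.

```python
from typing import Optional


def message_source(table: str, operation: str, column: Optional[str] = None) -> str:
    """Build a message source name from table, SQL operation, and column."""
    parts = [table, operation.lower()]
    if column:
        parts.append(column)
    return ".".join(parts)


print(message_source("comments", "INSERT"))  # comments.insert
print(message_source("comments", "UPDATE", "markdown"))  # comments.update.markdown
```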
Deimos
c0caec62c9 Upgrade Redis to 5.0.7 and update redis.conf 2020-01-20 12:55:31 -07:00
Deimos
a47517e2b8 Move gunicorn server config out of INI files
gunicorn 20.0.0 included a change so that it will no longer read server
configuration out of Paster files. Because of this, the settings for it
in development.ini and production.ini were no longer being used. This
resulted in the auto-reloading no longer working in dev, and the number
of workers being reduced back down to 1 in production. The socket/PID
may have been impacted as well.

This commit moves the configuration into command-line args used to
launch gunicorn, and uses a pillar variable to handle the args that
differ between dev and prod.
2019-12-17 15:46:45 -07:00
Deimos
d2605215ca Upgrade Python version to 3.8
The "noqa" comments intended for getting the mccabe tool to ignore a
method's complexity needed to be moved as part of this, for some reason.
2019-12-16 20:09:27 -07:00
Deimos
5c1cf3975d Update cmark-gfm to 0.29.0 2019-12-16 18:16:15 -07:00
Deimos
282df2bf02 Block SemrushBot in nginx (it ignores robots.txt) 2019-12-04 18:27:20 -07:00
Deimos
7fd1c3e72e Close voting after 30 days, delete vote records
This makes it so that posts (both topics and comments) can no longer be
voted on after they're over 30 days old. An hourly cronjob makes this
"official" by updating a flag on the post indicating that voting is
closed. The daily clean_private_data script then deletes all individual
vote records for posts with closed voting, and the triggers on the
voting tables have been updated to not decrement the vote totals when
these deletions happen.

The net result of this is that Tildes only stores users' votes for a
maximum of 30 days, removing a lot of sensitive/private data that builds
up over the long term.
2019-11-20 21:05:47 -07:00
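Editor's sketch for 7fd1c3e72e: the 30-day cutoff can be expressed as a simple age check. The names below are hypothetical, and the real change also involves a flag on the post, a cronjob, and database triggers that this does not model.

```python
from datetime import datetime, timedelta, timezone

# From the commit message: posts can no longer be voted on after 30 days.
VOTING_PERIOD = timedelta(days=30)


def is_voting_closed(created_time: datetime, now: datetime) -> bool:
    """Return True once a post is more than 30 days old."""
    return now - created_time > VOTING_PERIOD
```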
Deimos
1974d44de9 Redirect /donate to the page on the Docs site 2019-11-04 18:25:59 -07:00
Deimos
4f29ceba0b Salt: fix incorrect dependency for boussole
This didn't get updated when boussole was split out to its own
virtualenv, and still depended on the application's pip installs
succeeding.
2019-10-25 17:48:29 -06:00
Deimos
d01e37d2dc Change ownership of /opt/venvs + run pip non-root
Previously, the virtualenvs were owned by root and the pip installs were
done as root as well. This worked fine, but it meant that I couldn't use
pip-tools' pip-sync function without sudo. This makes it simpler by
giving ownership to the app user (tildes in prod, vagrant in dev).
2019-10-24 22:40:14 -06:00
Deimos
31de48e447 Start using pip-tools and split dev dependencies
I'm going to start using pip-tools to manage dependencies:
https://github.com/jazzband/pip-tools

This makes updating the dependencies and virtualenv easier in a few
ways, and makes it simple to keep dev dependencies split out (so I can
stop installing them in production).

Now, to check for updates and upgrade all packages to their newest
versions, the main command is:

pip-compile --no-header --upgrade requirements.in

and again with requirements-dev.in to update that one as well. This will
update all the package versions in requirements.txt and
requirements-dev.txt. The virtualenv can then be updated to match those
versions by running:

pip-sync requirements.txt

(or requirements-dev.txt for the dev environment). This currently needs to
be run with sudo, but I'm going to try to fix that shortly.
2019-10-24 15:15:02 -06:00
Deimos
5c3c7c4691 Fix hardcoded Python version in site-packages path 2019-10-22 18:15:45 -06:00
Deimos
e1e62bcf3c PostgreSQL: Install PL/Python, add basic function
This installs PL/Python (specifically plpython3u), enables it in the
database, and creates a function id36_to_id that calls the Python
function with the same name inside the tildes.lib.id module. This will
enable doing queries similar to this, when I have a topic's ID36 from
the site:

SELECT * FROM topics WHERE topic_id = id36_to_id('asdf');

The fact that this was possible to set up without having to port the
id36_to_id logic to a different language is blowing my mind a little.
There are some really interesting possibilities from being able to
import all of the Python code into the database itself.
2019-10-22 17:59:18 -06:00
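Editor's sketch for e1e62bcf3c: an ID36 is a number written in base 36, so the core of the conversion is a single int() call. This is a simplified stand-in for the function in tildes.lib.id, which may validate its input differently.

```python
def id36_to_id(id36: str) -> int:
    """Convert a base-36 ID string to its integer value (simplified)."""
    # int() with base 36 accepts the digits 0-9 and the letters a-z.
    return int(id36, 36)


print(id36_to_id("asdf"))  # 503331
```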
Deimos
ca509b220b Update PostgreSQL version to 12
Changing these pillar values is the only actual change to Tildes
code/config needed, but if you're upgrading an existing version from 10
to 12 you will need to do some manual steps. The below should cover it -
lines starting with a * are descriptions of things you need to do, while
the rest are actual commands to run:

sudo apt-get install postgresql-12

sudo systemctl stop postgresql@10-main.service

sudo systemctl stop postgresql@12-main.service

cd /var/lib/postgresql

sudo -u postgres /usr/lib/postgresql/12/bin/pg_upgrade -b /usr/lib/postgresql/10/bin/ -B /usr/lib/postgresql/12/bin/ -d /var/lib/postgresql/10/main/ -D /var/lib/postgresql/12/main/ -o '-c config_file=/etc/postgresql/10/main/postgresql.conf' -O '-c config_file=/etc/postgresql/12/main/postgresql.conf'

* Change pillar value to 12, and run salt

sudo systemctl stop postgresql@10-main.service

* Edit /etc/postgresql/12/main/postgresql.conf and change port to 5432

sudo systemctl restart postgresql@12-main.service

sudo -u postgres ./analyze_new_cluster.sh

* After verifying the new version seems to be working, clean up the old version:

sudo apt-get remove postgresql-10
sudo rm -rf /usr/lib/postgresql/10/
sudo rm -rf /var/lib/postgresql/10/
sudo rm -rf /etc/postgresql/10/
2019-10-20 12:47:45 -06:00
Deimos
1c34ca4f76 Add scheduled topics (no UI yet)
This adds the backend for scheduled topics, which can be set up to post
at a certain time and then (optionally) repeat on a schedule.

Currently, these topics must be text topics, and can have their title,
markdown, and tags set up. They can be configured to be posted by a
particular user, but if no user is chosen they will be posted by a
(newly added) generic user named "Tildes" that is intended to be used
for "owning" automatic actions like this.
2019-10-03 23:21:42 -06:00
Deimos
63b935927a Add frame-src to CSP for Stripe
The Stripe Checkout redirect was getting blocked by the Content Security
Policy, and requires being allowed through frame-src like this.
2019-09-20 15:23:46 -06:00
Deimos
6819b1917e Fix Content-Security-Policy header
Apparently add_header inside a location block doesn't... you know,
actually work. This should be reasonable, but I'd still rather only
allow the Stripe JS on the single page where it's necessary.
2019-09-20 12:50:05 -06:00
Deimos
ee934105b7 Add new version of Stripe Checkout for donating 2019-09-20 12:22:52 -06:00
Deimos
bc527b0c70 Serve Tildes robots.txt on tild.es as well 2019-08-31 18:39:45 -06:00
Deimos
d2889ef606 Add redirects for development pages on Docs 2019-08-21 12:26:28 -06:00
Deimos
d306d879ca Docs site: add redirects for old urls 2019-08-08 21:32:49 -06:00
Deimos
ff94f095b6 Fix run/restart method for transparent_hugepage
The previous method of doing this could cause redis to try to start up
(via restart) earlier than it should. By using require_in and watch_in,
redis should now only start up once this service has been started, and
it will also cause redis to restart if it ever needs to run again in
the future.
2019-07-05 20:05:35 -06:00
Deimos
ac4e8e9b54 Prevent symlink creation in npm install
Vagrant on Windows has issues with creating symlinks inside shared
folders - it requires a permission that isn't granted to a user by
default. This can be fixed by changing security policies, but for our
purposes we don't need the symlinks anyway, and can run the tools
manually like this, instead of using the .bin/ symlinks.
2019-07-05 17:22:45 -06:00
Deimos
730f0ea69a Only apply nginx ratelimit in prod
This is interfering with a few things in dev, including the debug
toolbar (since all of its assets are served by the app).
2019-06-29 14:34:10 -06:00
Deimos
7f518add82 Salt: add monitoring server IP to prod pillar 2019-06-20 21:55:39 -06:00
Deimos
9a373f4cbf Change tild.es to nginx redirect instead of proxy
Previously tild.es URLs would proxy_pass through to the views inside the
Pyramid app, but this caused strange behavior in some cases. For
example, anything that caused a 404 response would end up in a broken
page that still appeared to be on the tild.es domain, but would be an
HTML-only page coming from the app, since the CSS and JS would not be
available.

This method is still a bit weird in some ways (now you'll end up on a
404 page at https://tildes.net/shortener/... instead), but I think it's
an improvement overall.
2019-06-20 19:28:38 -06:00
Deimos
5e1197b0c6 Base activity sorting on "interesting" activity
This changes the "activity" topic-sorting method to look for
"interesting" activity instead of everything, and adds a new "All
activity" method that retains the previous behavior.

Currently, "interesting activity" excludes any comments that have active
Noise, Offtopic, or Malice labels, or any of their children. These
checks are also done based on labeling activity, so for example if
someone posts a new comment it will bump the thread initially, but if
that comment is then labeled as Noise, the thread will "un-bump" and go
back to its previous position in the Activity sort.

There were also some other minor changes made to appearance to support
adding another sorting option, such as shortening the displayed names on
the "tabs", like showing "Votes" instead of "Most votes". This probably
needs some further work, but is okay for now.
2019-06-12 14:33:11 -06:00
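Editor's sketch for 5e1197b0c6: one way to picture the "interesting activity" rule is to ignore any comment that has an excluded label itself or sits below a comment that does. The data model below is hypothetical (plain integers for times, a set of already-active label names), not the actual Tildes implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# From the commit message: these active labels make a comment uninteresting.
EXCLUDED_LABELS = {"noise", "offtopic", "malice"}


@dataclass
class Comment:
    """Hypothetical comment record; `labels` holds only active labels."""

    comment_id: int
    created: int
    parent_id: Optional[int] = None
    labels: Set[str] = field(default_factory=set)


def last_interesting_activity(topic_created: int, comments: List[Comment]) -> int:
    """Return the newest time among the topic and its interesting comments."""
    by_id = {c.comment_id: c for c in comments}

    def is_interesting(comment: Optional[Comment]) -> bool:
        # Walk up the parent chain; any excluded label on the way disqualifies.
        while comment is not None:
            if comment.labels & EXCLUDED_LABELS:
                return False
            comment = by_id.get(comment.parent_id)
        return True

    times = [c.created for c in comments if is_interesting(c)]
    return max(times, default=topic_created)
```

Recomputing this value after a label is applied gives the "un-bump" behavior: if the newest comment becomes Noise, the result falls back to the previous interesting comment's time.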
Deimos
ce512c5f40 nginx: add rate-limit for requests to Pyramid
This only affects requests that get proxied to the app, not requests for
static files or anything else.

The current configuration is based on IP, and allows a rate of 4/sec,
with an additional burst of 5 above the limit permitted, and burst
requests allowed to go through immediately (nodelay). For more info:
https://www.nginx.com/blog/rate-limiting-nginx/
2019-06-04 22:16:02 -06:00
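Editor's sketch for ce512c5f40: the effect of "4/sec, burst of 5, nodelay" can be modeled with a small per-IP counter that drains over time. This is a simplified model written for illustration, based on my understanding of how nginx's limit_req behaves. It is not nginx's code, and the class name is hypothetical.

```python
from typing import Dict, Tuple


class RateLimiter:
    """Simplified per-IP model of a rate limit with a burst and no delay."""

    def __init__(self, rate_per_sec: float = 4.0, burst: int = 5) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        # ip -> (excess requests above the steady rate, time of last request)
        self.state: Dict[str, Tuple[float, float]] = {}

    def allow(self, ip: str, now: float) -> bool:
        """Return True if a request from `ip` at time `now` is let through."""
        if ip not in self.state:
            self.state[ip] = (0.0, now)
            return True
        excess, last = self.state[ip]
        # The excess drains at the configured rate, then this request adds 1.
        excess = max(excess - self.rate * (now - last), 0.0) + 1.0
        if excess > self.burst:
            return False  # rejected; the stored state is left unchanged
        self.state[ip] = (excess, now)
        return True
```

In this model a single IP can send six requests at once (one plus the burst of five) before being rejected, and regains capacity for one more request every quarter second.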