Here are some notes on providing online services such as email, XMPP, shell accounts, DVCS, and general file/rsync hosting; the focus is on properly configured software and decent public services. I don't have much experience with public ones, but the notes are mostly on technologies rather than practices, and are aimed primarily at GNU/Linux systems. For private ones, see the notes on private server setup and simpler server setup.
Usually service providers are obliged to assist governments with surveillance and/or censorship, and possibly to follow additional laws on user information handling. That is not necessarily bad, but it is worse in some cases than in others: getting servers confiscated, engaging in mass surveillance and/or censorship, and/or setting up backdoors to enforce laws that don't make sense would be less desirable than just occasionally helping with actual crime investigations, once a warrant is provided and targeting individual users. So this should be investigated. Apparently the corresponding Russian law is such that it's better to keep services as far away from it as possible. Estonia provides "e-residency", which may help to provide services under its laws.
Although perhaps even with an oppressive government it is possible (acceptable) to provide a service to those whom it does not affect (i.e., pretty much anyone outside), while following all the regulations and presenting clearly what they are, as opposed to the common practices of not mentioning them at all, being security- and privacy-oriented but under the radar, in a grey area, and/or (partially) blocked, or boasting security and privacy while in fact following regulations opposing those. A slightly sarcastic presentation picturing benevolent supervisors providing a useful service by filtering "extremist content" and suchlike may also be quite fun. Among features, in place of strict privacy laws it could list some of the local nonsense, but from the point of view of a hypothetical happy citizen (though in order to keep it light, one would have to pick something that sounds silly/peculiar, yet not particularly bigoted). Maybe also presenting uncertainty and instability as excitement. Although as of 2021 and in Russia, since many foreign mail servers are being blocked, it seems that the delivery failure rate would be unacceptable for a mail service. And then 2022 with Russia's "special operation" happened, at which point having anything to do with Russia became an edgy choice in much of the world. Money transfers became inconvenient and limited, too; it is better to be in a sane jurisdiction, after all.
A relevant discussion (though probably there are plenty more around): "Ask HN: What is the best jurisdiction for internationally distributed teams?".
A user agreement should be prepared carefully, yet be readable.
Payment processors tend to be an issue as well, though some of their issues are just inherited from the bank cards (and most of the others – from trying to mitigate those with fraud detection). The options (e.g., PayPal) are bad, but they work sometimes, more or less.
Service abuse is what brings up some of the legal issues (and even when it doesn't, it's highly undesirable), but apparently it can be mitigated by requiring a small payment for confirmation, which is straightforward with regular bills, but viable with donations as well (e.g., as sdf.org does).
Though from the perspective of someone reporting network abuse, it seems pretty good if an abuse reporting email exists, is checked, and something is done about it at least after reporting. Probably those who don't care much just go ahead and run services without sorting out the abuse, and those who care too much don't even try to run such services; a good balance is needed.
SSH is one of the most widespread protocols with good authentication and software implementations, useful for both regular shell accounts and the ones restricted to provide specific functionality (email and DVCS, for instance), needed for pubnix-style systems.
Better isolation and restrictions than regular file permissions are desirable in systems shared among strangers. Some of the ways to set such restrictions can be observed in the hashbang/shell-server's "security" task, and here is the list I have collected:
sshd(8) can (and does by default on Debian,
          see sshd_config(5)) use pam(8), including
          session management modules such as pam_limits(8) (which
          sets ulimit and nice,
          see limits.conf(5)) and pam_namespace(8)
          (which sets polydirs such as per-user tmp directories, see
          namespace.conf(5)). These are user-space and
          not necessarily reliable: "PAM escape" via certain programs
          is possible, so those should be limited too.
        hidepid=2
          for proc(5), newinstance
          for devpts (documented in mount(8)), etc.
          Mounting /tmp/ in memory (tmpfs) and avoiding swap
          can be useful for both performance and security. Disk
          partition encryption with LUKS/dm-crypt would also be useful
          to reduce the risk of compromising user data, though that
          applies to computing in general.
        systemctl(1) can be used to set resource controls
          (see systemd.resource-control(5)) and
          limit resource usage for PAM sessions. I wonder why
          hashbang.sh only seems to set that for non-interactive
          sessions.
        In iptables-extensions(8) there's
          the owner extension, which allows matching
          outbound packets by local user and group. This seems
          useful for limiting user network capabilities without
          limiting system services.
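
As a minimal sketch of the owner match from the last item, the following Python snippet only builds (and prints) an iptables command line that rejects one local user's outbound TCP traffic to a given port; it does not run anything. The user name `alice`, the port, and the helper name `owner_rule` are made up for illustration, while `-m owner --uid-owner` are the options described in iptables-extensions(8).

```python
import shlex


def owner_rule(user, proto, dport, target="REJECT"):
    """Build an iptables argv matching a local user's outbound traffic to a port."""
    if proto not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    if not 0 < dport < 65536:
        raise ValueError("port out of range")
    return [
        "iptables", "-A", "OUTPUT",
        "-p", proto, "--dport", str(dport),
        # The owner match only applies to locally generated packets.
        "-m", "owner", "--uid-owner", user,
        "-j", target,
    ]


if __name__ == "__main__":
    # Hypothetical example: keep the user "alice" from sending mail directly.
    print(shlex.join(owner_rule("alice", "tcp", 25)))
```

The printed line would still have to be reviewed and applied as root; system services running under other users are not matched by such a rule.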
        
        For more restricted services, there may be no need for shell
        access, or for system users altogether, but other SSH uses may
        still be desired. There are SSH server libraries for that,
        e.g., libssh or a Haskell ssh library. It may be better to
        avoid libssh2, with its rather bad track record and regularly
        found vulnerabilities; though years after writing this, I used
        libssh2 for an SFTP client and ran into memory leaks with
        versions 1.7.0 and 1.9.0, then used libssh and ran into an
        infinite loop in sftp_open with version 0.7.3 (apparently not
        in 0.9.8), so no feature-complete SSH library seems to have a
        particularly good track record. Many per-key restrictions can
        be defined in authorized_keys files or encoded into
        certificates with OpenSSH (see sshd(8) for the
        documentation), including command restrictions. These may be
        too restrictive for some programs (where the arguments should
        be dynamic), but wrappers could be used for those.
      
        Gitea, for instance, forces execution of its own command
        (via command
        in ~git/.ssh/authorized_keys for each added
        user), and disallows everything but command execution (as used
        by git), manually ensuring that commands are git ones, and
        checking repository access privileges using its own
        rules. Similarly, rsync provides the rrsync script,
        also to be set via command, which only allows rsync
        to be used and restricts it to a certain
        directory. rssh likewise restricts commands
        available over SSH, mostly to file transfer ones.
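
The wrapper approach mentioned above can be sketched roughly as follows. This is only an illustration, not how Gitea, rrsync, or rssh are implemented: a script set as the forced command reads the client's requested command from the `SSH_ORIGINAL_COMMAND` environment variable (set by sshd), checks the program name against an allowlist, and executes it without a shell. The allowlist contents, the paths, and the name `parse_command` are assumptions. It would be referenced from an authorized_keys entry along the lines of `command="/usr/local/bin/ssh-wrapper",restrict` followed by the key, where the wrapper path is hypothetical.

```python
import os
import shlex
import sys

# Hypothetical allowlist: requested program name -> absolute path to run.
ALLOWED = {
    "rsync": "/usr/bin/rsync",
    "git-upload-pack": "/usr/bin/git-upload-pack",
}


def parse_command(original):
    """Return a validated argv for the requested command, or raise ValueError."""
    argv = shlex.split(original)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("empty command")
    program = ALLOWED.get(argv[0])
    if program is None:
        raise ValueError(f"command not allowed: {argv[0]}")
    return [program] + argv[1:]


def main(original):
    try:
        argv = parse_command(original)
    except ValueError as err:
        print(f"refused: {err}", file=sys.stderr)
        return 1
    os.execv(argv[0], argv)  # replaces this process; no shell is involved


if __name__ == "__main__":
    requested = os.environ.get("SSH_ORIGINAL_COMMAND")
    if requested is not None:
        sys.exit(main(requested))
    # No command was requested (an interactive login attempt): just end the session.
```

This only checks program names; a real wrapper would also need to validate arguments (paths in particular), which is what makes tools like rrsync more involved.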
      
VPN (IPsec, WireGuard, etc) usually provides both encryption and authentication, convenient for running simple protocols (unencrypted, maybe with host-based authentication) on top of it. Additionally, it may be convenient for connections between users.
PAM authentication may be nice to reuse for everything (possibly via SASL), especially if shell access is provided, but unfortunately it is mostly aimed at plaintext authentication.
SASL is nice for uniform authentication across services. Usually it is not tied to system users, and can be used with LDAP (and so can PAM). See the "user authentication" note for more on the topic.
To detach users from the underlying operating system (that is, to avoid using system users), possibly using a shared user directory across multiple servers, LDAP is a common option.
Applicability of different methods depends on the kinds of data stored. Some of the common ones are rsync, database replication and other built-in/specialized backup/synchronization methods, mirroring with RAID (1 in particular), and DRBD (see the section on HA).
A decent service shouldn't trap users, so horizontal scaling should be as easy as setting up identical systems, relying on federated protocols for interoperation. Configuration management systems such as Ansible are useful for that. Though high availability (see the next section) usually involves redundancy, which can easily provide scaling in some cases as well.
There are nice tools for highly-available (HA) clusters around: pgpool-II (for PostgreSQL), DRBD + GFS2/OCFS2 (for a distributed filesystem), Pacemaker (for general resource management/failover, including services and automated setting of load balancing via IP multicast). All those are available from Debian repositories, and seem to be maintained and fairly widely used.
        It is rather hard to be certain that a complex system would
        function properly under unexpected loads. Stress testing
        should be performed, and other iptables
        extensions could be useful here, such
        as hashlimit to set per-IP limits.
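
A per-IP limit with hashlimit might look like the rule built below; this is a rough sketch that only assembles and prints the command line. The limit name, port, rate, and burst values, as well as the helper name `per_ip_limit_rule`, are arbitrary assumptions; the `--hashlimit-*` options are the ones documented in iptables-extensions(8).

```python
import shlex


def per_ip_limit_rule(name, dport, rate, burst):
    """Build an iptables argv dropping new TCP connections above a per-source-IP rate."""
    amount, _, unit = rate.partition("/")
    if not amount.isdigit() or unit not in ("second", "minute", "hour", "day"):
        raise ValueError("rate must look like '6/minute'")
    return [
        "iptables", "-A", "INPUT",
        "-p", "tcp", "--dport", str(dport),
        "-m", "conntrack", "--ctstate", "NEW",
        "-m", "hashlimit",
        "--hashlimit-name", name,
        "--hashlimit-mode", "srcip",   # keep a separate counter per source address
        "--hashlimit-above", rate,     # match only traffic exceeding the rate
        "--hashlimit-burst", str(burst),
        "-j", "DROP",
    ]


if __name__ == "__main__":
    # Hypothetical values: at most 6 new SSH connections per minute per source IP.
    print(shlex.join(per_ip_limit_rule("ssh", 22, "6/minute", 3)))
```

Suitable rates depend on the service and are best picked from stress testing and monitoring data rather than guessed.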
      
Monitoring (with munin, Zabbix, or something along those lines) should be helpful for capacity planning.