converge/pkg/server/ui/about.templ

package ui

templ About() {
  <div>
     <h1>about</h1>
      <p>
          Converge is a utility for troubleshooting builds on continuous integration servers.
          It solves a common problem where the cause of job failure is difficult to determine.
          This is complicated further by the fact that build jobs are usually run on a build
          farm where there is no access to the build agents or in more modern envrionments when
          jobs are run in ephemeral containers.
      </p>

      <p>
          With Converge it is possible to get remote shell access to such jobs. This works
          by configuring the build job to connect to a Converge server using an agent program.
          The agent program can be downloaded from within the CI job using curl or wget.
          Next, an end-user can connect to the Converge server, a rendez-vous server, that connects
          the client and server together based on a common identifier specified by both client and
          server.
      </p>


      <h2>how it works</h2>

      <p>
      The basic principle of converge is described below. Access to a running remote continous integration
      job is usually not possible without a lot of access to the backend environment where jobs are running.
      However, the job can connect to a server running outside, and so can the client.
      </p>


      <div>
      <img src="../static/images/converge.svg" style="max-width: 800px"/>
      </div>

      The connection between
      client and agent is established as follows:
      <ul>
          <li>(1): the agent, started by the continuous integration job, connects to converge server through a websocket, this establishes a connection that
          is  similar to a TCP connection. In connecting, the agent specifies a
          rendez-vous id. After connecting, the agenta and Converge server perform multiplexing of connections
          over this single connection. This allows the agent to run an embedded SSH server and listen for incoming
          connections, just like normally is done with a TCP listener. </li>
          <li>(2): the client connects to converge server through SSH and also specifies the same rendez-vous id.
          Since SSH by itself cannot connect over websockets, a helper program <code>wsproxy</code> is used as
          a proxy command for SSH. Using <code>wsproxy</code>, the rendez-vous id is passed to the server as part
          of the websocket URL. </li>
          <li>(3): converge server connects the two connections after matching them based on the rendez-vous id.
          Now when a connection is setup from a client, it can connect to the appropriate agent, identifie dby
           rendez-vous id and setup a bi-directional connection. After this, Converge simply copies data between
           client and agent. </li>
          <li>(4): the agent runs an embedded SSH server and incoming connections to the agent are handed over to
             that server. At this moment an end-to-end SSH session is established.  </li>
          <li>(5): The agent spawns a shell that receives input from the user. Output from the shell is communicated
          back over the SSH session. The shell can be any shell (bash, cmd.exe, powershell.exe) or in fact any process.
          At this point, the user is connected to a remote shell running in the continuous integration job.
          </li>
      </ul>

      <p>There are a few special situations:
      <ul>
        <li> If no rendez-vous id is specified than a rendez-vous id is generated. </li>
        <li> If the agent uses an id already in use by another agent, then converge server will
             generate a new rendez-vous id. </li>
      </ul>
      The agent will always print the rendez-vous id and command required to connect to it.
      </p>

      <h2>security</h2>

      <p>
          The setup is such that the connection from client (end-user) to server (agent on CI job)
          is end-to-end encrypted. The Converge server itself is no more than a bitpipe which pumps
          data between client and agent.
      </p>

      <p>Using authorized keys is a secure way of connecting. When running the agent, the authorized keys
         must be put in a file, allowing only the designated users to connect. The file containing authorized keys
         can also be edited during a session with the agent, allowing more people to be added when required without
         having to start over again.
         Using authorized keys is made easy through the
         <a href="usage.html">usage</a> page, which provides the exact commands to execute based
         on the target environment. If users are hesitant to use their public key it is also possible
         to generate a separate ssh key-pair using <code>ssh-keygen</code> and use that instead.
      </p>

      <p>To be able to use Converge, you must already have access to the configuration of a build job.
         Having that access means it is possible to execute any command on a build agent. The Converge
         agent is started by the build job and does not have any additional rights compared to what you
         could script in the continous integration job definition.
      </p>

      <p>Converge does not provide any stealth features to hide it. The public sessions page show all
         agents and clients including details about the clients and the agents. The idea is that it should
         be light-weight and easy to use. There is no reason to hide the fact that someone is debugging
         a continuous integration job. Also, all sessions are logged,both using standard kubernetes tooling
         such as (fluentbit/filebeat, and loki/elasticsearch depending on the environment). This logging includes
         only the details about the sessions, but not what the user is doing inside a session. Also, Converge
         provides a prometheus metrics endpoint which allows user sessions to be tracked over time after
         the fact. Thie data is also made accessible using a grafana dashboard.
      </p>

      <h2>SSH and SFTP</h2>

      <p>
          Both ssh and sftp are supported. Multiple concurrent sessions to same agent are allowed as well
          as multiple agents are also allowed.
      </p>

      <h2>timeouts</h2>

      <p>
          There is a timeout mechanism in the agent such that jobs do not hang indefinitely
          waiting for a connection. This mechanism is useful to make sure build agents do not keep
          build agents occupied for a long time. By default, the agent exits with status 0 when
          the last client exits after logging in. The timeout is an inactivity timeout. Activity is
          detected as follows:
          <ul>
          <li><b>ssh</b>: any key press is considered activity</li>
          <li><b>sftp</b>: any output from the server side is considered activity. This is done to
             make sure that longer downloads cannot be killed by a timeout. A simple <code>ls</code> command
             in an sftp session will also lead to activity since the server will output the result of the command. </li>
          </ul>
      </p>
      <p>When the user touches a .hold file, the agent keeps waiting for connections even
         after the last client logs out, taking into account the timeout. By default the agent
         exits when the last user has logged out.
      </p>

     <h2>remote shell usage</h2>

     <p>
       The agent supports a <code>--shells</code> command-line option by which a comma-separated
       list of shells can be prepended to the default search path for shells, e.g.
       <code>--shells zsh,csh,sh</code> (linux) or <code>cmd,powershell</code> for
       windows.
     </p>

     <p>
       The agent sets an <code>agentdir</code> environment variable that points to
       the directory where the agent is running.
     </p>

     <p>The user will get notifications from the agent any time something important happens such
     as the session being close to timeout.
     </p>

       <h2>other tools</h2>

           <p>Using available existing tools such as
               <a href="https://github.com/namespacelabs/breakpoint">breakpoint</a> in combination
               with a websocket tunneling tool such as
               <a href="https://github.com/erebe/wstunnel">wstunnel</a> a similar solution can be
               obtained. There are however some problems with these solutions that converge is
               trying to address:
           </p>

           <p>
           <ul>
           <li>Breakpoint uses an embedded SSH server which is a really good idea but
           uses the QUIC protocol for connecting to a rendez-vous server. The rendez-vous server then
           exposes a random port for every client. This make deployment on kubernetes really hard
           where fixed ports must be used and QUIC is also not a widely supported protocol.</li>
           <li>The problem with the random ports can be solved by using wstunnel running together
           with breakpoint server in a kubernetes pod, where wstunnel can forward traffic over an
           external websocket connection to the local random port that breakpoint server is listening on.</li>
           <li>breakpoint leaves it open on how users install the breakpoint executable (agent). </li>
           <li>Because of the hacky nature of this setup, it is very difficult for users to use
           and troubleshoot when things go wrong. </li>
           </ul>

           </p>
           Converve server addresses these issues in the following ways:
           <ul>
           <li>Use the websocket protocol both for agents and for clients, providing a fixed port and
               a supported protocol for kubernetes deployment. Websockets are also supported by
               kubernetes ingress controllers so this makes it easy to deploy on kubernetes.
               To make this work with SSH which does not natively support websockets, a proxycommand
               <code>wsproxy</code> is provided that allows SSH to connect using websockets.
           </li>
           <li>Providing online documentation where the instructions take into account the
               hostname and protocol where converge is running allowing users to cut and paste
               instructions that can be used without modification. In the usage page the users
               can even generate the correct agent startup commands and client connection commands
               based on the type of shell they are connecting to. </li>
           <li>Converge server provides out of the box downloads of required software. This makes sure
               client and server are always up to date and can be downloaded in any continuous integration
               job without having to package the required executables in an ad-hoc way.
               In addition a protocol version check is done. </li>
           <li>User-friendly error messages can be given to users in most cases when things do not work
               out because of <code>wsproxy</code>. This is an SSH proxy command that communicates with converge
               and provides additional information to the user. </li>
           <li>A live screen showing the current sessions that are running. The sessions webpage provides
           additional feedback about the running sessions. </li>
           <li>Interactivity in the user's session with notifications about timeouts and a very
           simple inactivity timeout mechanism. </li>
           <li>Possibility for the user to define the remote shell to use. </li>
           <li>Support for unix like bash shells and command prompt and powershell. </li>
           <li>Observability w.r.t. non-functionals of converge and of agent and client sessions through
               prometheus monitoring. For session monitoring, separate grafana dashboard is provided. </li>
           </ul>
           <p>
           </p>

  </div>
}


templ AboutTab() {
    @BasePage(1) {
                    @About()
    }
}