Jun 29

Reverse the ‘Verse: June 2016 Subscriber Edition

This post is a transcript of Reverse the ‘Verse: June 2016 Subscriber Edition Summary, material that is the intellectual property of Cloud Imperium Games (CIG) and its subsidiaries. INN is a Star Citizen fansite and is not officially affiliated with CIG, but we reprint their materials with permission as a service to the community. INN edits our transcripts for the purpose of making the various show participants easier to understand in writing. Enjoy!

Reverse the ‘Verse: June 2016 Subscriber Edition – Incomplete Transcripts

  • Welcome to the weekly Subscriber RtV! They’re having some issues, they can barely hear Disco over in Austin.
  • This RtV stream is one of the ‘subscriber’ editions, taking questions exclusively from the RSI subscriber chat. You have to be a subscriber to have access.
  • Gonna start with a rundown of who the team are and what they do.
  • Ahmed – Dev ops engineer. Takes care of day to day operation of cloud infrastructure. They work with a large team; Miles, Keegan, Andy, Nate, Peas, etc… that’s the Dev Ops team. They take care of cloud infrastructure and builds. He came from the web industry, this is his first game job. Came from a startup in Egypt, mainly supporting high-traffic startups, helping them handle scaling and such. Mainly background in scaling cloud infrastructure.
  • When he saw the job post, he thought it would provide interesting challenges as an MMO that runs in the cloud.
  • Jason – In charge of backend service architecture day to day. Identifies what game features need as far as backend services go, helps make the world work on the backend.
  • A backend service – Friend services, persistence (bringing items from game servers into database), GIM (in charge of spooling up and shutting down instances) etc… Network code is between clients and game server; when flying in Crusader, etc… Backend is more command and control between servers.
  • Tom – Works with Jason. Senior server programmer. Works on backend services, dealing with lots of command and control tools, making sure they have an idea of what’s going on in real time built into PTU and Live etc…
  • The three on the show are the primary fire fighters – first line of defense, essentially. After a patch goes out, they’re the first ones fixing things.
  • Jason started at Origin, working on Ultima and Crusaders. Started Asylum software, making boutique MMO’s. Got into building MMO’s. Also worked at Qualcomm as a server engineer, building high performance network apps.
  • Tom was with Sony PlayStation – Online Technology group. They built up network libraries for studios, shipped about 160 PS titles using the framework. He used to be more FPS; working on an MMO has a whole different set of problems: persistence, itemization, etc… Itemization is the most technical part, managing all the items that can be traded, purchased, etc… And SC is a different beast as well, with itemization an order of magnitude larger than other MMO’s.
  • 2.4 started persistence, but there’s a lot more persistence coming. The first important thing is persistent itemization, then come persistent servers and worlds etc…
  • Persistence is fundamental for any MMO; like tires on a car, for MMO’s. Which is why it’s such a big deal, and why they talk about it this much, that they’re out in alpha. No MMO’s go ‘live’ before the persistence systems are in, makes Persistence a huge milestone in development.
  • Building persistence while everyone’s watching… it’s something very needed. It’s not a ‘sexy’ feature, it’s an expected one. But it takes a huge amount of work. They had to bring it in carefully; right time, make sure game design was hammered out, etc… And they still have a lot of data that has yet to be designed, but they had to get player inventory, etc… before they could begin a strategy. There’s a lot more that goes on behind the scenes, since players’ll have an average of thousands of items, rather than 100’s like other MMO’s.
  • They have to tailor the data so it doesn’t bog anything down. Server programmers have to manage the backend, for potentially 50-250,000 players, managing the data, the gameplay guys and UI team have to have visual displays in-game so changes get made and pushed to the persistent servers… lots of tiers to manage items so they have complete control of when it writes to the database.
  • They have to have different tiers because one game server may have 30 players, it gets sent to a cache system, which holds the data and then writes it to the database. When they have hundreds of servers, they have to have a way to put info in that won’t kill the database.
  • For every item in the game they can get a history of where it was, when it was traded, when it was sold, etc… lots of command and control tools in place for that.
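
The tiering described above (game servers hand changes to a cache, which then writes to the database on its own schedule) can be sketched roughly as a write-behind cache. This is only an illustration, not CIG's implementation: `FakeDatabase`, `WriteBehindCache`, the item names and the flush threshold of 3 are all hypothetical.

```python
from collections import defaultdict


class FakeDatabase:
    """Hypothetical stand-in for the real database; counts the batches it receives."""

    def __init__(self):
        self.rows = {}
        self.batches_written = 0

    def write_batch(self, changes):
        self.rows.update(changes)
        self.batches_written += 1


class WriteBehindCache:
    """Collects item changes from game servers and writes them out in batches."""

    def __init__(self, database, flush_threshold=3):
        self.database = database
        self.flush_threshold = flush_threshold  # assumed value, for illustration only
        self.pending = {}                       # item_id -> latest state
        self.history = defaultdict(list)        # item_id -> every state seen

    def record_change(self, item_id, state):
        self.pending[item_id] = state           # later changes overwrite earlier ones
        self.history[item_id].append(state)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.database.write_batch(self.pending)
            self.pending = {}


db = FakeDatabase()
cache = WriteBehindCache(db, flush_threshold=3)
cache.record_change("helmet-1", "owned by alice")
cache.record_change("helmet-1", "traded to bob")  # same item, coalesced in the cache
cache.record_change("rifle-7", "owned by alice")
cache.record_change("ship-2", "owned by carol")   # third distinct item triggers a flush
```

Four changes reach the cache but the database sees a single batch, while the per-item history still holds both states of the helmet. That is the point of the middle tier: the cache decides when the database gets written to.
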
  • [Ahmed – building an MMO in the cloud – what makes SC in cloud unique?] When the ‘cloud’ movement started, people said it’s not for everyone. Can’t just take an app and put it in the cloud. You have to use other layers that Cloud provides; object storage, movements between VM’s, firewalls, etc… and everything operates as a service. All really good stuff, but you’re relying on resources shared among other people. Have to design it all for failure, expect things to go down at any moment.
    • The ‘statelessness’ – there’s no state that might die at any moment. Everything can be accessed in different ways. It’s all about distributed systems. Having it in the web, everyone does it. Games are the complete opposite. They’re very stateful, MMO’s are extremely stateful. Leave your machine running 24 hours, you won’t tolerate a 404 or a disconnect etc…
    • Getting stability in the cloud is not easy. Takes a lot of control out of the dev’s hands. The challenges are really that MMO’s are alien in the cloud. No-one else has done what CIG does. Not a lot of people manage to have a game server in the cloud with things working, and that’s the challenge. If you want to use a new tool in the cloud, and you code in a modern language, and your use case is something to move records, you’d find an example in five minutes. Making a game, that’s different.
  • [CIG have plans in place to address the challenges. Some we can’t talk about, some are going to be revealed by Chris, but in the broadest sense, how are we facing the challenges?] They want to implement what’s known as an event-driven data centre. The more they provide float resources, ‘time to market’, how fast would it be to scale a game server. How can they repurpose resources, etc… Cloud is mostly utility computing. You pay for what you’re using.
    • They want to implement event driven centers. Everything that goes through a major event, it would go through busses that would react to it. They can orchestrate that how they want. They need to make all the logic that the gameplay guys are writing report what it needs, and the cloud infrastructure has to react to it by expanding, contracting, etc…
    • That’s the plan. Moving to that, the way they usually do it in dev ops, you can’t really draw everything out. They create the minimum viable product, get the players in the game, and upgrade from there. Move fast and break things. It’s like persistence – if they had laid out persistence and coded it all before laying it out, they’d have hit errors. Implementation is easier when you have something to test with. At one point they didn’t even have game data designed, they didn’t even know what they had to persist.
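
The event-driven idea above (game logic reports what it needs on a bus, and the infrastructure reacts by expanding or contracting) can be sketched as a minimal publish/subscribe loop. Everything here is an assumption made for illustration: the `EventBus` and `ServerPool` classes, the `player_count_changed` event name, and the figure of 30 players per server borrowed from the persistence discussion.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process bus: handlers subscribe to named events."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self.handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        for handler in self.handlers[event_name]:
            handler(payload)


class ServerPool:
    """Hypothetical pool that grows or shrinks when player counts are reported."""

    def __init__(self, players_per_server=30, minimum_servers=1):
        self.players_per_server = players_per_server
        self.minimum_servers = minimum_servers
        self.servers = minimum_servers
        self.history = []  # pool size after each event, kept for inspection

    def on_player_count(self, payload):
        # Ceiling division: enough servers to hold every reported player.
        needed = -(-payload["players"] // self.players_per_server)
        self.servers = max(self.minimum_servers, needed)
        self.history.append(self.servers)


bus = EventBus()
pool = ServerPool()
bus.subscribe("player_count_changed", pool.on_player_count)

bus.publish("player_count_changed", {"players": 95})  # pool expands
bus.publish("player_count_changed", {"players": 10})  # pool contracts
```

The game code never calls the pool directly. It only publishes what it observed, which leaves the infrastructure side free to change how it reacts.
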
  • Disco remembers those meetings, deciding what the players get to persist, what doesn’t, etc… and all that is still evolving. There are many game systems still to come online, still to evolve.
  • [Scale of the game, scale of how SC dwarfs previous projects. Talk about the scale] Itemization. WoW, for example, you can have 100s of items on a character. Lots to keep track of. In SC, just with ships, things you buy on the site right now, you’re in the 1000s, and there will definitely be players with 50,000+ items associated with a character. Lots of thought being put into how to deal with that. If they have a million online at a time, 50,000 per player, it’s a lot to move around.
    • Before persistence, architecture was very FPS-like. They could pretty easily hit 250,000 player numbers with the code they had before, but once you add persistence, you have to change everything. Writing all services in C++. They can scale different ways, but they have to consider load balancing and availability. If any service goes down, something else has to pick it up. The service has to always be available. Multiple instances of any service up at any time.
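
The availability requirement above (several instances of each service, with another picking up if one goes down) can be illustrated with a small round-robin balancer that skips failed instances. The team says its services are written in C++, so this Python sketch is purely illustrative, and `ServiceInstance`, `LoadBalancer` and the `friends-*` names are hypothetical.

```python
class ServiceInstance:
    """Hypothetical service instance that can be marked unhealthy."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class LoadBalancer:
    """Round-robins across instances and moves on when one fails."""

    def __init__(self, instances):
        self.instances = instances
        self.next_index = 0

    def send(self, request):
        # Try each instance at most once per request.
        for _ in range(len(self.instances)):
            instance = self.instances[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.instances)
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # this instance is down, try the next one
        raise RuntimeError("no healthy instance available")


balancer = LoadBalancer([
    ServiceInstance("friends-a"),
    ServiceInstance("friends-b", healthy=False),
    ServiceInstance("friends-c"),
])
results = [balancer.send(f"r{i}") for i in range(1, 4)]
```

The second request lands on the downed instance and is transparently retried on the next one, so the caller never sees the failure.
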
  • They want to apply mega server methodologies. Single login, single Universe where you can play with everyone, rather than locking people into servers. Designing the mega server architecture up-front.
  • [What’s a megaserver?] One giant world. A collection of servers with services working together to give the impression of one server. Rather than something like WoW where you have 200+ servers to choose from. Makes it difficult cause they have to work with players all over the world as well, it’s a global mega server. Players have to feel like they’re there, can play with whomever, and it needs to feel like they’re next-door. No transitioning to another realm or anything.
  • And now it’s question time. Questions coming are taken from the RSI subscriber chat.
  • [Ever had a typo break everything?] The way they deploy environments is ‘green load deployments[?]’ (most likely blue-green deployments). They stage a shadow environment, then switch the environments when they’re ready. It was around 2.2 or something, and they were breaking some rules cause they had to go live. With DevOps, they deliver what the community needs as quick as possible. Ahmed had an extra space in a URL that would generate lots of stuff going to servers, and it caused an issue with authentication. Caused login errors, QA couldn’t get in.
    • It wasn’t production though, so they were able to catch it, but it was hard to find the extra space.
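
The stage-then-switch process described above resembles what is commonly called a blue-green deployment. A minimal sketch follows, assuming a hypothetical `Router` that swaps environments only after a health check passes. The whitespace check is inspired by the stray-space incident, and the URLs are placeholders, not real endpoints.

```python
class Router:
    """Sends traffic to whichever environment is currently marked live."""

    def __init__(self, live, shadow):
        self.live = live
        self.shadow = shadow

    def promote_shadow(self, health_check):
        """Swap environments only if the shadow one passes its health check."""
        if not health_check(self.shadow):
            return False
        self.live, self.shadow = self.shadow, self.live
        return True


def auth_url_is_clean(environment):
    # Hypothetical pre-switch check: reject any whitespace in the auth URL.
    return not any(ch.isspace() for ch in environment["auth_url"])


blue = {"name": "blue", "auth_url": "https://auth.example.invalid/login"}
green = {"name": "green", "auth_url": "https://auth.example.invalid/login "}

router = Router(live=blue, shadow=green)
first_attempt = router.promote_shadow(auth_url_is_clean)   # blocked by the trailing space

green["auth_url"] = green["auth_url"].strip()
second_attempt = router.promote_shadow(auth_url_is_clean)  # switch goes through
```

Because the old environment stays staged as the new shadow, switching back after a bad release is the same cheap swap.
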
  • [Do you use a common framework like Spring or Struts on the backend?] That’s a Java framework? Universe cluster is primarily C++ with some 3rd party libraries. Web does lots of Java stuff though.
  • [What config management tools are used? Log management analysis tools?] They use Chef, and a bunch of other homebrewed scripts. They have a graph to show explaining some of it. There’s a graph now on screen.
    • Album of the graph
    • Current build is based on buildbot?
    • This has gotten extremely difficult to explain.
    • Incredible amount of detailed explanation here, about how the server stuff works. If you’re interested, watch the video at ~30 minutes. I simply cannot summarize this.
    • Any cloud environment in CIG has three VM’s, one is the hub, one’s the game server, etc… None of it is final though. They’ll be doing a lot of work on it.
    • There are lots of different kinds of servers, they have stag drivers, analysis servers, they run Chef… What they’re showing is not final. They’re working on a new deployment pipeline.
  • And that explanation is done. Time to get back to summarizing.
  • [Patch reduction – What insight on progress of patch size reduction?] That project is running on lots of caffeine. It involves people from all different departments; Dev Ops, IT, Engine programmers, etc… Everyone’s working to create a new patch delivery system. Affects players and developers. When they get it done, it’ll help devs as well, cause devs have to pull new builds all the time as well.
    • They feel really bad when they deliver three builds in a day, they want to give a seamless experience. Having a smaller patch is key for everyone. It’s a challenge because they’re operating a live environment at a very early point in the game. Mike Pickett in Austin is working on the issue. He’s one of the first that was working on it, but it extends to lots of people. The system has to be native; every part needs to use it.
  • There’s a huge team involved in SC, Disco likes to give shoutouts to people as much as he can, so Mike, Good Work!
  • Wooo Thanks Mike! Keep it up!
  • [Ahmed, why are you so amazing?] Glasses.
  • [What colour network cable is your nemesis?] Dealing with stuff you can touch with your hands stopped years ago. They forget about colours of stuff now, because of the cloud. Now they just tell stuff to happen and it happens. Not always the right way, that’s their job, but Ahmed hasn’t touched a network cable in a while.
    • Disco ran his own IT company for a while, always got interns to wire the cables.
  • [In some MMO’s, bot armies set up to grind currency is an issue. What’s being done to prevent that?] They don’t answer those questions cause they don’t want the bad guys to know what they’re doing. They know we want to know how we’re gonna do it, but the more info they put out, the more info the bad guys have to circumvent. Jason has lots of tricks up his sleeve though. They’re watching, they know what they’re going to try to do… they’ll catch ’em.
  • [Client Framerate vs Server Framerate?] Client framerate is the rate at which your graphics are refreshing. On server side, it’s the simulation rate. They’re not locked together. Lots of players are concerned about FPS. First off, thanks to everyone that plays PTU, and to all the Evocati. It’s amazing. What they try to do on PTU is to define points that should be addressed. Optimization is a rabbit hole. They have a product that’s out, that they want to make playable and fun. They try to point things to engineers in Frankfurt etc… to try to make it okay, but they have a lot that they aren’t delivering on, because it’s not the time for it. It’s liveable right now, that’s important. They do have tricks up their sleeves for later though.
    • People have on their radar what’s most expensive on the game server and on the client side. But there are so many features going in, it’s premature to optimize them. Once you optimize code, it looks different. Wide range of things they can optimize. On the client, there’s rendering and client side simulation. On the backend, there’s physics. Also game servers rely on response from backend, etc… They optimize certain things over time, but there are so many departments that’ll optimize later. If they optimize now, they might paint themselves into a corner later.
  • The team is finite. When they know something is temporary, meant to hold over to the new system, you don’t want to spend time optimizing it.
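
To illustrate the earlier point that rendering rate and simulation rate are not locked together, here is a generic fixed-timestep loop, a common game-programming pattern. It is not presented as how Star Citizen's client or servers work; `run_loop` and the millisecond values are assumptions chosen so the arithmetic stays in exact integers.

```python
def run_loop(frame_times_ms, sim_step_ms):
    """Advance the simulation in fixed steps while frames arrive at arbitrary intervals.

    Returns (simulation ticks, frames rendered).
    """
    accumulator = 0
    sim_ticks = 0
    frames_rendered = 0
    for frame_time in frame_times_ms:
        accumulator += frame_time
        # Run as many fixed simulation steps as the elapsed time allows.
        while accumulator >= sim_step_ms:
            sim_ticks += 1
            accumulator -= sim_step_ms
        frames_rendered += 1  # one render per frame, however long it took
    return sim_ticks, frames_rendered


# Fast renderer: ten 10 ms frames, simulation stepping every 50 ms.
fast = run_loop([10] * 10, sim_step_ms=50)

# Slow renderer: two 100 ms frames over a longer stretch of time.
slow = run_loop([100, 100], sim_step_ms=50)
```

In the first run, ten frames are drawn for two simulation ticks. In the second, two frames are drawn for four ticks. The simulation keeps its own pace either way.
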
  • [Will site be hosted from the same servers as the game?] Website and backend are about to have a very close relationship. Previously it’s always been services and platform separate, but now they’re coming together. They’re getting married. They’ll have dedicated channels for bi-directional communication. Working on how to architect both sides to make data and responses as easy as possible. Right now, when you buy a ship on the site, there’s no way for platform to tell that. They grab your package, and then they know about it, and they can create the ingame items and persist them. With the new system, the moment you buy a ship, they’ll get notified, and it’ll be in-game.
  • [What hypervisor do you use in the cloud, and what core OS do you use?] Google Compute Engine (GCE), custom KVM. The hypervisor has different ways to extract instructions, but there shouldn’t be an issue whether they’re relying on Xen, KVM, or anything else. It won’t be the main issue. Currently they use Ubuntu 14.04 as the core OS that runs the game, but the stream just cut out, and I don’t know the rest.
  • Stream is dead, Twitch just died.
  • Not sure what’s going on. Stream hasn’t come back. Might just be dead?
  • Disco also left the ‘subscriber chat’, so it’s possible they lost connection entirely.
  • According to Zyloh, LA lost internet.
  • And they’re back.
  • There was a computer crash, they’re back just to say thanks for watching, for supporting, and for submitting questions.
  • There will be a stress-test at 3pm, so in… 3 minutes or so. Stress test!
  • They’re happy to be here putting together something amazing, and giving a great gaming experience. Tom’s hoping PTU goes well so he can sleep tonight.
  • PTU stress test happens in 3 minutes, stress test for 2.4.1
  • If you’re on PTU, hop on, head into the SC Discord, help out!
