Click Technology

Linux, Windows, Mac it's all good

Nice ChatGPT use case..

March 23

Saw this, thought of you.

https://www.chatpdf.com/

Upload your PDF to ChatPDF, e.g. “Honda GCV 530 Lawnmower Engine Users Manual”

Ask it a question, e.g. “The engine won’t start”

It replies with a list of possible options and references. It's nice for cutting corporate documents down to just the bare minimum for comprehension/overview purposes, too.

Ryan Gosling clapping as an animated gif
posted under Linux Tips

AWS Cost reckoning tips

February 3

Welp, it’s FAANG redundancy time and so companies are looking to save a few quid.  Here are a couple of cheap and easy fixes that should save enough to pay for half your devops team.

Correct sizing of EC2 instances

This is so often overlooked by DevOps when it comes to saving money.  Over- or under-spec’d instances cost extra cash, and leaving them switched on when they’re not being used is bad news for the company’s wallet.

To size instances correctly, check out my favourite site https://ec2instances.info for a comprehensive search/filter site for all EC2 instances.  They’re on GitHub too so give them a like if you get time. Then, head over to AWS Compute Optimizer which learns what your instances do and then tells you (after about a month) whether they’re too big or too small. Resize accordingly.
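
If you’d rather pull Compute Optimizer’s verdicts programmatically than click through the console, a minimal boto3 sketch along these lines works (assuming Compute Optimizer is already opted in and has had its learning period; the field names are those of the GetEC2InstanceRecommendations API as I understand it):

import boto3

# Sketch: print Compute Optimizer's finding and top suggestion per instance.
co = boto3.client("compute-optimizer")

resp = co.get_ec2_instance_recommendations()
for rec in resp["instanceRecommendations"]:
    # rank 1 is the top recommendation
    best = min(rec["recommendationOptions"], key=lambda o: o["rank"])
    print(f'{rec["currentInstanceType"]:<14} {rec["finding"]:<18} suggested: {best["instanceType"]}')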

Your choices may be limited by things like needing instance-store SSDs, which are still ephemeral, but you should be able to use Compute Optimizer to make rational decisions about resource allocation.

Once you’re happy the performance is right and the size is correct, it’s time to start locking in the value for the company.

Get Reserved Instances

Next, ffs buy Reserved Instances.  Don’t pay the On Demand sticker price for EC2 instances; that’s for amateurs.  Reserved Instances are precisely that: you reserve capacity for 1 to 3 years, paying partially or fully up front, in exchange for a much lower rate.

Always buy RIs if you have a fairly reasonable idea of future demand.  You don’t have to be clairvoyant either; you can start with one-year contracts, which is pretty standard for business.  I used to live in Germany, where everything is done in one-year chunks, so estimate your usage and commit to Reserved Instances to meet your needs, saving real costs and real money.  Again, without RIs you are just bleeding costs unnecessarily.  If you can commit to 3 years, and that’s not that improbable, you can save a whopping 70% on your costs.
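
If you’d rather script the comparison than trawl the console, here’s a rough boto3 sketch of pulling the “All Upfront” offering prices for an instance type (a sketch only, using DescribeReservedInstancesOfferings as I understand it; double-check the figures against the pricing pages before you commit):

import boto3

# Sketch: what do 1-yr and 3-yr "All Upfront" RIs cost for a given type?
ec2 = boto3.client("ec2")

resp = ec2.describe_reserved_instances_offerings(
    InstanceType="t3.micro",
    ProductDescription="Linux/UNIX",
    OfferingClass="standard",
    OfferingType="All Upfront",
    IncludeMarketplace=False,
)

for o in resp["ReservedInstancesOfferings"]:
    years = o["Duration"] / (365 * 24 * 3600)   # Duration comes back in seconds
    print(f'{o["InstanceType"]}  {years:.0f} yr  ${o["FixedPrice"] / years:,.2f} per year up front')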

Let’s table up some real-life annual costs for a few sample instance types..

Type          vCPU   RAM (GB)   On Demand     Spot          1 Yr Reserved   3 Yr Reserved   % saved
t3.micro      2      1          $91.10        $28.91        $56.94          $39.42          56.7
r5a.2xlarge   8      64         $3,959.52     $1,841.35     $2,496.60       $1,708.20       56.9
r5.16xlarge   64     512        $35,320       $13,714       $22,250         $15,259         56.8
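
The last column is simply the 3-year Reserved price measured against On Demand, which works out as:

# How the "% saved" column is derived: 3-yr Reserved vs On Demand (t3.micro row).
on_demand, three_yr = 91.10, 39.42
print(round(100 * (on_demand - three_yr) / on_demand, 1))   # 56.7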

Develop your software for spot instances

Now you can see the difference in costs, it’s time to break out the technologies that let you run software on swarms and goal-oriented clusters, like Docker, Kubernetes and so on, so you can take advantage of spot instances. Spot instances are basically spare instances that are not being used. Like a last-minute holiday, or buying flowers just before the market closes, you get a significant reduction in price and AWS gets some $$$ for an instance that would otherwise sit idle.

Of course, not everything can be swarmed, and bioinformatics is no different, so this applies more to the Node.js / Angular / Svelte / React space and modern containerized apps, but if you can do it, spot pricing might be the option for you. It is by far the cheapest option and the lowest commitment. On the downside, if you’re using something esoteric, it may be more difficult to get capacity at times of heavy consumer demand.
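
For the simplest possible taste of it, requesting spot capacity is a one-flag change when launching an instance. Here’s a minimal boto3 sketch (the AMI ID is a placeholder, and bear in mind AWS can reclaim a spot instance at short notice, which is exactly why the cluster approach above matters):

import boto3

# Sketch: launch a single spot instance instead of paying on-demand rates.
# The AMI ID below is a placeholder - substitute your own.
ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "one-time",              # don't re-request if reclaimed
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)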

Convert all your Buckets to S3 Intelligent Tiering.

You use S3 Buckets? Great. It’s perfect for petabytes of data and is bigger than all the storage you’ll ever own, that’s for sure. However, you absolutely MUST switch the storage mode to Intelligent Tiering to really save a fortune.

Typically, people tend not to configure the S3 storage so it is left in Standard mode and costs about $0.022 per GB. That’s not that much of course until you put terabytes of data onto S3 and pay full price for it.

With, say, about 700TB on S3 in standard mode, you’ll be spending in the region of $15,000 USD on storage a month. That’s 100% unsustainable, and if you’re also paying about $4k for support and not being warned about it each and every day, you’re an absolute mug.

So, using the pricing calculator ( https://calculator.aws ), let’s do the maths for this model..

S3 Standard storage

700 TB per month × 1024 GB per TB = 716,800 GB per month

Capacity        Tier                 Price per GB   Cost
51,200 GB       first 50 TB          $0.024         $1,228.80
460,800 GB      next 450 TB          $0.023         $10,598.40
204,800 GB      remaining 200 TB     $0.022         $4,505.60
                                     Subtotal       $16,332.80 / month
                                     UK VAT @ 20%   $3,266.56
                                     Total          $19,599.36 or £16,254.33 / month

Pricing per month for S3 storage in Standard mode.
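
If you want to sanity-check that against your own capacity, the tiered maths is trivial to script using the same per-GB rates as the table above:

# Sketch: S3 Standard monthly cost for 700 TB using the tiered per-GB rates above.
gb = 700 * 1024                              # 716,800 GB
tiers = [(50 * 1024, 0.024),                 # first 50 TB
         (450 * 1024, 0.023),                # next 450 TB
         (float("inf"), 0.022)]              # everything beyond 500 TB

cost, remaining = 0.0, gb
for size, rate in tiers:
    used = min(remaining, size)
    cost += used * rate
    remaining -= used

print(f"${cost:,.2f} / month ex VAT")             # $16,332.80
print(f"${cost * 1.2:,.2f} / month inc 20% VAT")  # $19,599.36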

So what realistically can be achieved?

Well, based on my own experience, 60% reduction in cost is perfectly possible and probably up to 70% so that would be a drop from $12,716.39 a month to $5,093.04 for around 700TB. Nearly 60% savings or $111k USD in a year!

Some notes..

You will, as always, need to know your environment well and use the pricing calculator carefully to understand the costs. You will also need to have your average object size info and other details to get the analysis as close to real-world conditions as possible. One useful question I was often asked was “Will this mean if we have to recover data, we get charged for it if it’s in the Infrequent Access Tier?” No. That would only apply if you had moved your data out of S3 Intelligent Tiering into one of the S3 Glacier Tiers, namely S3 Glacier and S3 Glacier Deep Archive. Then, charges apply for recovery of data from the archive.

The HOWTO is here, and you can get all your buckets converted in one go using this AWS article and the Python script therein, which is here in the AWS repo. Naturally, test first, as I did, and see how it looks. It worked fine and scaled fine. In my case I drove it with Ansible just as easily.
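
If you just want the gist of what the conversion does, the core of it is a lifecycle rule per bucket that transitions objects into the INTELLIGENT_TIERING storage class. Here’s a minimal boto3 sketch of that idea (my own simplification, not the AWS script itself; note that put_bucket_lifecycle_configuration replaces any existing lifecycle rules, so merge rather than overwrite on real buckets):

import boto3

# Sketch: add a lifecycle rule moving every object in a bucket to
# S3 Intelligent-Tiering. WARNING: this call replaces the bucket's
# existing lifecycle rules - fetch and merge first if you have any.
s3 = boto3.client("s3")

def convert_bucket(bucket: str) -> None:
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [{
                "ID": "move-to-intelligent-tiering",
                "Status": "Enabled",
                "Filter": {},                    # applies to the whole bucket
                "Transitions": [{
                    "Days": 0,
                    "StorageClass": "INTELLIGENT_TIERING",
                }],
            }]
        },
    )

for b in s3.list_buckets()["Buckets"]:
    convert_bucket(b["Name"])
    print(f'converted {b["Name"]}')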

You may well be wondering whether there is a performance hit. No. None. Nobody will notice anything. Run the script and everything keeps working as normal.

The only disadvantage is that it will take time to transition everything. If you run the code today, day 0, you’ll wait 30 days for unused data to transition from the Frequent Access tier to the Infrequent Access tier. Then you have to wait until day 90 for the data to move from Infrequent Access to the Archive Instant Access tier, so there will be a delay. That said, you can expect cost to reduce step-wise by about 25% in the first month, then to 45% in the second, and to 60%+ after the third month.

So, if you start say in August, you should feel the full effect of your action by Christmas, so the sooner you start, the sooner you save. It’s the lowest risk, least political, most effective move you can make at work and will realize really large savings to the business. That is perhaps the most important part of using cloud tech. You absolutely must work with the mindset that all costs are going on your own personal credit card and you will have to pay for it next month.

Articles about this often mention that there is a ‘small charge’ for setting up Intelligent Tiering, as AWS has to inventory all your files and prepare a list of all of them over 128 KB in size. How small is small? Are we talking 5/10k small? A couple of hundred bucks? What? The answer is about $25 USD for 700TB with tens of millions of objects, so no need to lose sleep over these initial charges. They will be recouped almost immediately.

Clap! Clap! Clap!
CFO looking at me (off camera) cutting him a cheque for 110 grand.

posted under AWS, Linux Tips

Neat AWS feature

January 19

So I’m studying for my AWS Solutions Architect Associate ticket at the moment and, in the coursework I’m doing, I came across this really neat feature in AWS.

If you’re on an  EC2 instance (a VM), and you do..


$ curl http://169.254.169.254/latest/meta-data


You get a whole load of metadata back, like this..


ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
events/
hostname
iam/
identity-credentials/
instance-action
instance-id
instance-life-cycle
instance-type
ipv6
local-hostname
local-ipv4
mac
managed-ssh-keys/
metrics/
network/
placement/
profile
public-hostname
public-ipv4
reservation-id
security-groups
services/


All you need to do now is pick any of these sub-topics and query it again…


curl http://169.254.169.254/latest/meta-data/local-hostname


and you get


ip-10-16-57-144.ec2.internal


or query


curl http://169.254.169.254/latest/meta-data/instance-type


and you get


t2.micro


Nice huh?  Since the feature is just a URL, you can query it from a Flask or NodeJS app directly, so an app can be aware of what kind of hardware it’s running on. You could even have an app report telemetry to a central server to let developers know about the characteristics of the host in relation to the app performance.  Quite a neat piece of internal architecture.
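
As a tiny illustration of that idea, here’s a Python sketch an app could use (the metadata paths are the ones listed above; on instances where IMDSv2 is enforced you’d first need to fetch a session token, which I’ve skipped for brevity):

import json
import urllib.request

# Sketch: let an app discover what it's running on and bundle it as telemetry.
META = "http://169.254.169.254/latest/meta-data/"

def meta(path):
    with urllib.request.urlopen(META + path, timeout=2) as r:
        return r.read().decode()

telemetry = {
    "instance_id": meta("instance-id"),
    "instance_type": meta("instance-type"),            # e.g. t2.micro
    "availability_zone": meta("placement/availability-zone"),
    "local_hostname": meta("local-hostname"),
}
print(json.dumps(telemetry, indent=2))                 # or POST it to your central server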

"Nodding Guy" Meme: Robert Redford

Nice.


posted under General, Linux Tips

Fallacies of distributed computing

December 4

This is something I’ve always known about in the back of my mind when planning systems.  Chaining critical-path dependencies can be problematic if there’s a fault somewhere and, if it can go wrong at exactly the wrong time, it probably will.  I’m always a fan of having a Plan B because there’s no better feeling than being able to jettison a work stream and move to a more direct approach.

Funnily enough, I wandered right into this concept just now and thought I’d post something about it because it has actually been postulated already.  Ladies and gentlemen, I present to you the “Fallacies of distributed computing”.

They are…

  • The network is reliable
  • Latency is zero
  • Bandwidth is infinite
  • The network is secure
  • Topology doesn’t change
  • There is one administrator
  • Transport cost is zero
  • The network is homogeneous

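Just to make the first one concrete: code that assumes the network is reliable calls out and waits forever, while code that doesn’t sets a timeout and retries. A trivial Python sketch (the URL is a placeholder):

import time
import urllib.request

# Sketch: don't assume the network is reliable - bound the wait and retry.
def fetch(url, attempts=3, timeout=2.0):
    for i in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as r:
                return r.read()
        except OSError:
            time.sleep(2 ** i)          # back off: 1s, 2s, 4s...
    raise RuntimeError(f"gave up on {url} after {attempts} attempts")

data = fetch("https://example.com/")    # placeholder URL
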
In many ways, the wider concept is that, as humans, our understanding of pretty much everything is wrong.  Why?  Well, nobody can have 100% of the information on a subject at any one time, whether that’s prices, markets, commodities, people or their intentions.  As any part of that information could prove critical, its absence means you are de facto misinformed.

You’re potentially the victim of misinformation depending on what you then do based on the (mis)information given.

I think this is the reason I’ve always been a fan of working with what I would call conscientious objectors in a team.  People who will always argue the contrarian point.  It’s a klaxon that can help people wake from their tendency to group-think and over-comply.  Getting something done is not just always about repeating what we did the last time.  Great as a template, sure, but this time around?  As any environment changes, real-world or digital, so too do the possible new tools available, newer methods, better people, and the realisation that our thinking has been way too small from the start.

So, bottom line?  It pays to remind yourself how little you know.

posted under General

Publish localhost website to the internet..

November 3

Need to show somebody the development website you’re running on localhost on your desktop?

You could push the project to AWS, Digital Ocean or some other host and start it there, and that’s cool, but what about a one-liner that just lets a colleague see the site over the internet?

ngrok has now entered the chat.

How to use?

Log in with your github account or create an account.

Download the Linux client. Unzip the client to /usr/local/bin and mark it executable.

sudo chmod +x /usr/local/bin/ngrok

Now click the ‘Getting started / Your AuthToken’ menu on the ngrok website and run the authtoken command, which looks something like this..

$ ngrok authtoken 1zdxnDpvAvPpqWrpNL6zIazyzIrQQWhqfKNRbF48bW123

OK, now you’re ready. Start your localhost website, e.g.

$ ./node_modules/.bin/vue-cli-service serve &

Now publish the site..

$ ngrok http 8080

If the site fails to publish because the host header is rejected, just use the -host-header option, like this..

$ ngrok http 8080 -host-header="localhost:8080"

Once started you get the output below and you can copy and paste the URL to your messaging app / Skype / FB Chat etc., and voila.

Output..

ngrok by @inconshreveable (Ctrl+C to quit)

Session Status online
Account clicktechnology (Plan: Free)
Version 2.3.40
Region United States (us)
Web Interface http://127.0.0.1:4040
Forwarding http://84de-213-152-186-35.ngrok.io -> http://localhost:8080
Forwarding https://84de-213-152-186-35.ngrok.io -> http://localhost:8080

Now message the URL to your Slack / MM whatever and watch people connect to your site.


Nice, huh?

via GIPHY

posted under Linux Tips

Long story short

October 26

I was trying to configure a FreeRADIUS server and, as is the tendency in Linux, devs confuse documentation with configuration. I’d much prefer a .conf file without the entire documentation embedded in it; I’d rather have a README in the root, a /docs directory or a man page instead.

Anyhoo, how to remove all the garbage and just get the .conf file commands only?

$ sed '/^[[:space:]]*$/d; /^[[:space:]]*#/d' radiusd.conf.original > radiusd.conf

This removes all blank lines and all comment lines (those starting with ‘#’, including indented ones), leaving a nice compact .conf file. The file goes from ~27k to 1.3k.

Noice.

posted under Linux Tips

This is my website for short blog posts and interesting materials that are noteworthy or have some handy-tip value.