<p><em>Taco Steemers: A personal blog (tacosteemers.com)</em></p>
<h1>Consider backing up build dependencies</h1>
<p><em>Taco Steemers, 2024-01-01</em></p>
<p>Software build dependencies are a risk in general. A specific risk that is discussed less often is that libraries and packages might become unavailable.</p><h3 id="the-problem">The problem</h3>
<p>Build dependencies, also called packages or libraries, can disappear from dependency servers.
Dependency servers can also go offline, even if only temporarily.
Either can leave a project in a state where it cannot be built.</p>
<h3 id="the-corporate-solution">The corporate solution</h3>
<p>The easiest way for companies to solve this is to run their own dependency server, and to configure their build processes to pull dependencies from, and publish dependencies to, that private server.
That dependency server can then serve as a backup for the public dependency servers.
It also becomes easier to use dependencies that were developed in-house.
Example solutions are <a href="https://jfrog.com/artifactory/">Artifactory</a> and <a href="https://aws.amazon.com/codeartifact/">AWS CodeArtifact</a>.</p>
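<p>As a sketch of what that looks like on the build side, a Gradle project can be pointed at such a private server with a small repository block. The URL and credential property names below are placeholders, not a real setup:</p>

```groovy
// build.gradle: pull dependencies through the company server
// instead of going to the public servers directly.
// The URL and the property names are placeholders for your own setup.
repositories {
    maven {
        url 'https://repo.example-company.internal/maven'
        credentials {
            username findProperty('internalRepoUser')
            password findProperty('internalRepoPassword')
        }
    }
}
```

<p>Publishing to the same server is usually a matter of adding a matching <code>repositories</code> block to the <code>publishing</code> section.</p>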
<h3 id="other-things-to-consider">Other things to consider</h3>
<p>Depending on your specific situation, the source code and documentation of libraries may also be important artifacts for the development process. Documentation may disappear from the web, and source code that used to be available might become closed off. I have also seen cases where applications and libraries that we had started using years earlier were sold, renamed or rewritten. As a result, the online documentation and sources that remained available applied to versions newer than ours, and were hard to find at all because the names and websites had changed.
Consider making backups of the documentation and source code of libraries, and of the installers and documentation of software. </p>
<h3 id="a-hobby-project-solution">A hobby project solution</h3>
<p>A less ideal solution is storing the dependencies offline.
To use such a backup we would have to switch the build system to an offline mode in which we can point it at a directory.
A better way might be to copy the backed-up dependencies into the build system's own cache.
That is easy to do with Java's Maven build system, but may need configuration and could be more difficult for other build systems <a href="https://docs.gradle.org/current/userguide/build_cache.html">such as Gradle</a>.</p>
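<p>With Maven, for example, the dependencies for a project can be resolved up front into a directory that is then easy to back up. This is a sketch; the directory name is just an example:</p>

```shell
# Resolve all dependencies and build plugins into a separate
# local repository directory that we can back up.
mvn dependency:go-offline -Dmaven.repo.local=./backup-m2-repo

# Later, build from the backup without touching the network.
mvn --offline -Dmaven.repo.local=./backup-m2-repo package
```
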
<p>One thing to consider is whether we want to back up entire build-system cache directories, or only the dependencies used in an individual project's build steps.</p>
<p>For a hobby project written in Java I am trying out the Gradle task below. It should run on every build.
It copies all dependencies that the build system, Gradle, can find.
The copied files can be added to version control, or to some kind of backup system if that is preferred.</p>
<p>I am considering changing that code to use the information resolved by Gradle to look up the matching entries in the Gradle cache directory, and backing up those directories instead. That way it would be easy to insert them back into the cache and let Gradle use them.</p>
<div class="highlight"><pre><span></span><code>// Needs these imports at the top of build.gradle:
// import java.nio.file.Files
// import java.nio.file.Paths
// import java.nio.file.StandardCopyOption

task backupResolvedLibs() {
    doFirst {
        // Collect the copies in &lt;project&gt;/build_dependencies
        var outDirPath = project.projectDir.toPath().toAbsolutePath().resolve('build_dependencies')
        Files.createDirectories(outDirPath)
        // 'resolvableImpl' is a resolvable configuration defined elsewhere
        // in this build; substitute the name of your own configuration.
        configurations.resolvableImpl.resolvedConfiguration.resolvedArtifacts.each {
            var source = it.file.toPath().toAbsolutePath()
            var target = outDirPath.resolve(it.file.name)
            println source
            println target
            // Snapshots change between builds, so always refresh those;
            // release artifacts only need to be copied once.
            if (it.file.name.contains('SNAPSHOT') || !Files.exists(target)) {
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING)
            }
        }
    }
}
</code></pre></div>
<p>We need to make sure that the task runs regularly.
In Gradle that means adding it as a dependency of another, commonly run task.
It also needs to run after the build step, to make sure that the dependencies have actually been resolved.
The lines we need to add to that Gradle task look something like this:</p>
<div class="highlight"><pre><span></span><code>dependsOn 'clean'
dependsOn 'build'
dependsOn 'backupResolvedLibs'
tasks.findByName('build').mustRunAfter 'clean'
tasks.findByName('backupResolvedLibs').mustRunAfter 'build'
</code></pre></div>
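<p>Another option is to drive the tasks from a small wrapper script that stops at the first failing step, so the backup only runs after a successful build. A sketch, assuming the standard <code>gradlew</code> wrapper is present:</p>

```shell
#!/bin/sh
# Abort the script as soon as any command fails.
set -e

./gradlew clean
./gradlew build
# Only reached when the build succeeded.
./gradlew backupResolvedLibs
```
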
<p>One nice way to add things to build steps is to use a <a href="https://gist.github.com/maxisam/e39efe89455d26b75999418bf60cf56c">shell script that fails when any step fails</a>. The shell script can call the 'backupResolvedLibs' Gradle task after the build has succeeded.</p>
<h1>Factors in deciding which jlink compression level to use</h1>
<p><em>Taco Steemers, 2023-03-11</em></p>
<p><a href="https://docs.oracle.com/en/java/javase/11/tools/jlink.html">jlink is the Java equivalent of a linker</a>. It can be used to bundle your code and dependencies with a Java Runtime Environment.</p>
<p>With the use of jlink it becomes easier to deploy an application to a customer because we don't need the customer to manually install the correct Java runtime environment.
jlink comes with a few options for reducing the output size.</p>
<p>Here we look at the --compress option.
It accepts three levels: 0 (no compression), 1 (constant string sharing) and 2 (ZIP compression).
I think the numbers indicate how much time they cost during the execution of jlink, with level 2 taking the most time.</p>
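<p>For reference, the level is passed on a normal jlink invocation. This is a sketch; the module path, module name and output directory are placeholders:</p>

```shell
# Build a trimmed runtime image with ZIP compression (level 2).
jlink --module-path build/jmods:$JAVA_HOME/jmods \
      --add-modules com.example.app \
      --launcher app=com.example.app/com.example.app.Main \
      --compress=2 \
      --output build/image
```
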
<p><a href="https://bugs.openjdk.org/browse/JDK-8157991">According to this bug report</a> the ZIP option can lead to a 13 millisecond slower startup time on a Hello World program. Personally I am okay with small startup delays for the graphical desktop application I am working on.</p>
<p>What concerns me more is installer size and size on disk. I will show an example based on a private project I am working on.
Here I used the excellent InnoSetup for creating the installer, but the general idea will apply to any installer or packaging system that applies compression to reduce the total package size.</p>
<table>
<thead>
<tr>
<th>jlink compression level</th>
<th>InnoSetup Installer size</th>
<th>Size on disk</th>
<th>Size on disk compared to installer size</th>
</tr>
</thead>
<tbody>
<tr>
<td>Level 0: No compression</td>
<td>21.864.448 bytes</td>
<td>97.210.368 bytes</td>
<td>~4.5 times increase</td>
</tr>
<tr>
<td>Level 1: Constant string sharing</td>
<td>21.835.776 bytes</td>
<td>77.504.512 bytes</td>
<td>~3.6 times increase</td>
</tr>
<tr>
<td>Level 2: ZIP</td>
<td>36.855.808 bytes</td>
<td>61.747.200 bytes</td>
<td>~1.7 times increase</td>
</tr>
</tbody>
</table>
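<p>The last column of the table follows directly from the two size columns. A quick sanity check, with the sizes copied from the table above:</p>

```java
public class RatioCheck {
    // Size on disk divided by installer size.
    static double ratio(long installerBytes, long onDiskBytes) {
        return (double) onDiskBytes / installerBytes;
    }

    public static void main(String[] args) {
        System.out.printf("level 0: %.2f%n", ratio(21_864_448L, 97_210_368L)); // about 4.45
        System.out.printf("level 1: %.2f%n", ratio(21_835_776L, 77_504_512L)); // about 3.55
        System.out.printf("level 2: %.2f%n", ratio(36_855_808L, 61_747_200L)); // about 1.68
    }
}
```
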
<p>As we can see, the installer is bigger when we use the strongest jlink compression.
I believe this is because the components that have already been compressed by jlink cannot be compressed much further, and they might reduce the potential for compression among the other files.
It can also be that InnoSetup uses a more efficient compression method.</p>
<h3 id="conclusion">Conclusion</h3>
<p>To conclude, there are four factors we can take into account when deciding which jlink compression level we want to use.</p>
<ul>
<li>Package or installer size. Use level 1 if you want a smaller package size. It will still benefit the size on disk after installation.</li>
<li>Size on disk. Use level 2 to get the smallest disk usage after installation. Note that more bytes will need to be stored and transferred by the software distribution system.</li>
<li>jlink execution speed. Choose level 0 if you want the fastest jlink execution speed.</li>
<li>Application startup time. Choose level 0 if you want the fastest application startup.</li>
</ul>
<h1>Succeeding with work assignments</h1>
<p><em>Taco Steemers, 2022-07-19</em></p>
<p>Here are some ideas that may help knowledge workers in their approach to assignments or projects.</p><div class="toc"><span class="toctitle">Table of contents:</span><ul>
<li><a href="#refinement-planning-ahead">Refinement: planning ahead</a></li>
<li><a href="#does-the-assignment-fit-our-current-capabilities">Does the assignment fit our current capabilities?</a></li>
<li><a href="#lets-get-to-work">Let's get to work</a></li>
<li><a href="#keep-refining-working-on-tasks-and-reviewing">Keep refining, working on tasks and reviewing</a></li>
<li><a href="#getting-un-stuck">Getting un-stuck</a></li>
<li><a href="#keep-stakeholders-informed">Keep stakeholders informed</a></li>
<li><a href="#manage-stakeholders-expectations">Manage stakeholders' expectations</a></li>
<li><a href="#external-dependencies">External dependencies</a></li>
<li><a href="#storage-and-delivery-of-our-work">Storage and delivery of our work</a></li>
<li><a href="#consider-the-projects-time-budget">Consider the project's (time) budget</a></li>
<li><a href="#change-requests">Change requests</a></li>
<li><a href="#some-tasks-are-never-finished">Some tasks are never finished</a></li>
</ul>
</div>
<p>Taking action and solving problems is fun! But we shouldn't start taking action right away.
We want the end result to be satisfactory, and we are unlikely to get a satisfying result if
we jump right in.</p>
<p>Here I present some ideas that may help us to successfully complete our assignments.
<em>These are just suggestions and may not work well in your specific situation.</em></p>
<p>Usually an assignment has at least a refinement phase and an execution phase.
In both phases the general idea is to take the next
question or action that comes up and get started on that.
Finally, there is a review phase, where we review our work or have it reviewed by others.
When we are working on assignments with sub-tasks, like a whole project, we will keep going from
refinement, to execution, to review, delivery and so on.</p>
<p>We will spend a lot of our time communicating with people.
Asking for input, clarification on that input, keeping stakeholders informed
and soliciting feedback on our work.</p>
<p>A good first question to ask may be if we are really supposed to do work for the person who gave us the assignment.
We don't want to do good work but end up getting reprimanded for it by our manager.</p>
<p>Thinking things over will do wonders for the end result,
but only up to a point. Thinking things over is not the same
as completing the assignment. Completing something requires
taking action. Each action we take will help clarify the
assignment. We need to keep evaluating if we need a moment to investigate,
or if we should take action based on what we know at this moment.</p>
<h2 id="refinement-planning-ahead">Refinement: planning ahead</h2>
<p>First we must make sure there is a written description that we
can use as a basis. If it does not yet exist we will make one.
The reason we need a written description is that
details and nuance matter. Without a description to fall back on
and update over time, we will forget or fail to think of
important details. It will not only help to clarify what should
get done; it will clarify what did and what did not get done
over the course of the assignment. We may need to talk to
several people to get a complete enough picture. </p>
<p>It is important that the description is clear about what is expected from us, and what isn't.
What does an acceptable result or delivery look like?
Are we expected to provide support to anyone after we have finished this assignment?
Whoever asked for the delivery might expect us to be available for questions and
additional work afterwards. Is there a set number of hours listed in the contract?
We need to take this type of information into account when we plan our work. </p>
<p>If the assignment doesn't meet our standards, we should refine
it first. Find out what is missing from the story.</p>
<p>Let us look at the description of the assignment. In it, we find
sub-tasks and more things we should look at.
Maybe there are unfamiliar terms and acronyms or there is a
lack of information in a specific area.
Make note of these as we read through. Each loose end must be
written down so that we can look into it at the right
time; that way we hope to avoid wasting our time now and in the
future. We write down new sub-tasks for these loose
ends.</p>
<p>An assignment and each of its sub-tasks:</p>
<ul>
<li>Should be specific as to the desired outcome versus the
current situation.</li>
<li>Should have a short but accurate title. This makes it easier
to talk about the assignment and avoid misunderstandings.</li>
<li>Should be very clear on what problems we are and are not
solving here.</li>
<li>Should mention related assignments/tasks to allow for better
decisions while working on this one. </li>
<li>Should be in some way time-boxed, though it may slide out of that time box. To time-box something means to indicate roughly how many hours it is okay to spend on it. </li>
<li>Should mention other topics that might be affected and might
need our attention.</li>
<li>Should mention alternative solutions or workarounds, or a lack
of them. Workarounds can be vital in determining our priorities.</li>
<li>Preferably can be explained easily. This may be hard to
achieve but is worth the effort.</li>
</ul>
<p>When we are done refining we may need to remove unnecessary information. We do this
last; if we do this early we don't know enough to decide
which parts are not relevant.</p>
<h2 id="does-the-assignment-fit-our-current-capabilities">Does the assignment fit our current capabilities?</h2>
<p>We should also pause to consider that an assignment should go
to someone who has a good chance
of completing it to everyone's satisfaction, given the resources
and time available. Is that the case here? If not, waste no time
and discuss this with the relevant people. </p>
<p>Broadly speaking we can aim for one of these three outcomes:</p>
<ul>
<li>Expand the available resources, such as additional assistance
becoming available to us.</li>
<li>Reduce what is being asked of us in the given time frame. For
example, the assignment can be split up and assigned among more
people or requirements can be dropped.</li>
<li>Get someone else to pick up the assignment.</li>
</ul>
<h2 id="lets-get-to-work">Let's get to work</h2>
<p>Some sub-tasks can be assigned a block of 25 minutes
(<a href="https://en.wikipedia.org/wiki/Pomodoro_Technique">like the Pomodoro technique</a>).
Others are tasks that have external dependencies. An example of
this is when we put in a request for access or information
somewhere, and after that we just
have to wait it out. These can be picked up in-between other
tasks. Don't forget to add a task to remind ourselves of
these open tasks. In all likelihood we will need to follow up
on these several times before they are resolved.
While waiting, we may be able to pick up other tasks.
If not, we might look at relevant standards and
documents, as well as existing work we have in-house.
Make notes of things that might be relevant.
It would be great if we could use this waiting time to refine
the assignment description and its sub-tasks.</p>
<p>Try to stay in the flow of picking up and completing tasks.
Keeping it up is the easiest way to get through.
There may be delays due to people reviewing our work or because we need to get access to external systems.
That is all part of the job. We don't let it get to us.</p>
<h2 id="keep-refining-working-on-tasks-and-reviewing">Keep refining, working on tasks and reviewing</h2>
<p>While working on completing our tasks, we keep making notes of any possible
ways to split up bigger tasks, or ways to clear up large or vague tasks,
and any open questions and loose ends that need to be investigated.
We keep switching between refining, taking action and reviewing.
We keep refining our backlog of open tasks and evaluating our finished work, preferably with our stakeholders.
New insights may lead us to re-work things that we thought were already finished.
We keep iterating on our work, building it up to an acceptable delivery.</p>
<p>These ideas are part of what some people call the "agile" way of working.
Instead of planning everything ahead and strictly sticking to the original plan,
we accept that requirements can change, as long as the changes fit within the time frame and budget. More on that later.</p>
<p>We may feel that there could be undesirable interactions
between the current work and existing activities. Write
these thoughts down and follow them up in a timely manner.
Don't wait until it feels too late already. By that I don't
necessarily mean late in the day; I mean that we should pick
things up soon, to make sure they don't become a problem. </p>
<p>In a general sense small open tasks are probably best
to handle as soon as possible. Done is done.
After that we can get back into the flow,
working on the bigger parts of the assignment
where we have less context switching.</p>
<h2 id="getting-un-stuck">Getting un-stuck</h2>
<p>Everyone gets stuck at some point, and we are no different.
It is good to try to recognize if we are stuck, before we have lost too much time and energy.
When we get stuck we can ask for help, but we can also take a walk or try again tomorrow.
Whichever feels suitable. We might realize how we can get un-stuck while trying to
explain the situation, while taking our walk or while unwinding at home.</p>
<p>In programming circles there is the idea of "rubber duck debugging".
The solution to our problem often comes to us while explaining our problem
to someone else, and we might as well explain it to a rubber duck. That way we don't bother anyone.</p>
<h2 id="keep-stakeholders-informed">Keep stakeholders informed</h2>
<p>The people who depend on our work will need to be kept informed.
If we don't keep them informed they may get anxious.
Try to find out who needs to be informed, and how often they want to be informed.
Perhaps one person wants to have a short chat about our progress on a near daily basis,
but another person prefers a more formal bi-weekly meeting.</p>
<p>We also keep the stakeholders informed of any risks we may see coming up.</p>
<h2 id="manage-stakeholders-expectations">Manage stakeholders' expectations</h2>
<p>We should at all times try to manage expectations.
Maybe we have some early results that we are happy to show off.
Those early results might give people the idea that we are almost there,
even when in reality we are just exploring something that will become a dead end.
Getting feedback is great, but <a href="/articles/2020-10-25-on-clarifying-the-status-of-demoed-products.html">don't let stakeholders feel that they are looking at something that is nearly finished</a>.</p>
<p>Some people say that it is better to "under-promise and over-deliver".
What they mean is that it is better to promise less than we expect to deliver.
That way we may either end up delivering what we promised, or delivering more than we promised.
That would be better than delivering less than what we promised. </p>
<h2 id="external-dependencies">External dependencies</h2>
<p>Sometimes the success of our assignment is dependent on things that are out of our control.
In that case we need to make sure that these external dependencies are making progress as well.
We don't want to get stuck with our work because other people did not start their part yet.</p>
<p>Potentially problematic external dependencies are one of the types of risks that stakeholders need to be informed about.</p>
<p>We also keep track of people's availability. We ask them if they are available on the days when we need them. </p>
<h2 id="storage-and-delivery-of-our-work">Storage and delivery of our work</h2>
<p>Some questions to ask ourselves are:</p>
<ul>
<li>In what format are we expected to deliver our work?</li>
<li>Are we expected to use specific technologies or formats?</li>
<li>How can we keep it accessible to stakeholders while the work is in progress?</li>
<li>How can we export our work from the application we are working in?</li>
<li>Is our work being backed-up? We don't want to lose it all due to a computer failure or lost laptop.</li>
</ul>
<h2 id="consider-the-projects-time-budget">Consider the project's (time) budget</h2>
<p>Perhaps we are working on a project with a fixed budget.
As a junior employee we are unlikely to have to worry about budgeting.
Even so, it is good to know what the status of the project is (are we running out of money or time?) and who is paying for it.</p>
<p>Which budget is this work being paid from? Or is it billed directly to a client?
We need to ask the relevant people how many hours we could spend on this at most.
We need to plan accordingly, to give us a good chance of ending up with
a good deliverable before we run out of budget.
Note that we are not asking them how long they think the assignment
should take.</p>
<p>We might be asked for a perfect deliverable. If our budget doesn't seem to allow for that,
we may need to opt for a merely acceptable deliverable instead.
Be sure to discuss this situation with the appropriate stakeholders.
This may only become clear when work has already been started.</p>
<p>Sometimes we can avoid some work by spending money.
It is easier to take advantage of that if we already know who to ask for permission
and how to expense these costs to our organization or client.
Any expense is probably coming out of the same budget as our working hours,
and will decrease how many hours we can work on this assignment.</p>
<p>Finally, what are the rules for tracking the time or money spent on this assignment?
Do we need to use a special code for our timesheet application?</p>
<h2 id="change-requests">Change requests</h2>
<p>What do we do if we are asked to make changes late in the assignment?
These changes and additional changes resulting from them may not fit in the budget.
We should discuss the risk of failure as soon as possible,
and get written approval and acknowledgement of the increased risk of failure
before we get started on any changes.</p>
<h2 id="some-tasks-are-never-finished">Some tasks are never finished</h2>
<p>Be aware that it is possible that people will keep asking us about the work we did for years to come.
Perhaps our contribution makes us look like we are responsible for whatever related problems come up.
New insights and new requests may come up, and our name is the first that people think of.
If we don't have time for this additional work it is best to discuss that with our manager.
Note that when requests for work are coming in from people other than our manager we may need to decline
and refer these people to our manager. If we don't, we risk getting overloaded with work and deadlines.</p>
<h1>In case you have doubts about putting your (hobby) stuff out there</h1>
<p><em>Taco Steemers, 2022-01-14</em></p>
<p>There is going to be someone out there who is a good fit for what you made. They will like it, and even though you might not know they exist, and they forget about you the moment a notification comes in on their phone, something good was accomplished there.</p><p><em><a href="https://www.youtube.com/watch?v=cxSg8vyJEcg">A video version can be found here</a></em></p>
<p>Maybe you have wondered if you want to put things online, out in the public.
Like a personal website, a blog, or videos.
Maybe you wonder about what people think.</p>
<p>I did at some point have doubts.
I have a personal website, I like writing small articles or little webpages like the one you see now.
I was worried about what people might think.
I know I usually shouldn't worry about what people think, but I do sometimes do that.</p>
<p>People might think that I have an amateurish website and low-quality articles.
They might have opinions about that strange person in my YouTube videos, sounding kind of robotic.</p>
<p>But then I started looking at which of my pages get visits.
And usually, none of my pages get any visits at all.
So I guess there is nothing to worry about.
I should just do what I like, nobody is having an opinion of me.</p>
<p>There are some search terms where I am in the top five of Google search results.
Those people get value from me and can go on with their lives without being stuck on something.
I didn't try to get a good Google ranking for those search terms.
I was just doing what I like.
So mission accomplished, I had fun and some people found a good result for their search terms.</p>
<p>These days I also have a few informational videos online on YouTube, and YouTube provides analytics.
Some people open them, but most don't actually watch a useful amount of the video.
I was thinking, that is okay, because I like having made them, and I learned about making videos and doing basic editing.</p>
<p>Then, someone out there put a like on one of the videos.
That right there, is another mission accomplished.
Someone was helped by my video. I think that makes it worth the time I have spent on making videos.</p>
<p>I think that it is important to realize that people will only have an opinion of you if they are thinking about you,
and they probably are not thinking about you at all.
They might click away what you made, but I do that too. That is fine.</p>
<p>There is going to be someone out there who is a good fit for what you made.
They will like it, and even though you might not know they exist,
and they forget about you the moment a notification comes in on their phone,
something good was accomplished there.</p>
<p>So if you like making harmless things, think of putting them out there.
You might enjoy it like I do, and someone out there will enjoy it too at some point.
A true win-win situation.</p>
<h1>What is HTTP, actually? What happens when we access a web page</h1>
<p><em>Taco Steemers, 2022-01-08</em></p>
<p>What is HTTP, actually? A short description of the Hypertext Transfer Protocol and what happens when we want to access a web page.</p><p>In this article we take a look at HTTP based on what people most often use it for: requesting a webpage.</p>
<p>HTTP stands for <a href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">HyperText Transfer Protocol</a>.
This webpage is served to your browser by a webserver.
Due to the Hypertext Transfer Protocol both your browser and the webserver know what they have to do when you decide to
visit a website.
A protocol is a set of rules that explain how to handle a situation. <a href="https://en.wikipedia.org/wiki/HTTPS">HTTPS is HTTP with added Security</a>.
Hypertext is text connected to other text by hyperlinks, the links we click to move from webpage to webpage.
Webpages today are not purely text anymore; they also contain images, videos and sounds.
For that reason we often talk about hypermedia instead of hypertext.
The reason they call this media "hyper" is because it is interactive, as opposed to text on paper.
We usually call these documents or <a href="https://en.wikipedia.org/wiki/Web_resource">resources</a> instead of hypermedia.
In this article I will use an example with a document.</p>
<p>HTTP is used for internet communications. It is an application layer protocol, meaning that it is used between applications.
In the model for computer to computer communications, the Open Systems Interconnection model,
<a href="https://en.wikipedia.org/wiki/OSI_model#Layer_architecture">application layer protocols sit at layer 7</a>, the highest layer.
Besides HTTP there are a lot more details to how this page was delivered to you!</p>
<p>In the protocol there is a client and a server.
In this example my browser is the client and whichever computer contains my website is the server.
The client sends an HTTP request to the server. The server gives a response.
The client may request a specific document, using a specific version of HTTP.
That is what we call the GET request, and it is the simplest example.</p>
<p>This is what a request from my browser for my webpage looks like:</p>
<div class="highlight"><pre><span></span><code>GET https://tacosteemers.com/articles.html
</code></pre></div>
<p>The client can also add more details to their request, called <a href="https://en.wikipedia.org/wiki/List_of_HTTP_header_fields">header fields</a>.
Examples are the user's login information or that they prefer not to be tracked.</p>
<p>My browser has added many details to the request.
Here are some of the request header fields:</p>
<div class="highlight"><pre><span></span><code>Host: tacosteemers.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
If-Modified-Since: Sun, 02 Jan 2022 20:56:04 GMT
</code></pre></div>
<p>The <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host">Host request header</a> is mandatory and clarifies
which host we want to place a request with.
This may seem redundant because we are already placing our request at this host when we <code>GET https://tacosteemers.com/articles.html</code>.
However, it is not redundant. My website is served to the client from a server that serves up many websites.
That server does not have the name <code>tacosteemers.com</code>. Instead it will be accessed by a name that may look like
<code>web887.dc3.example.com</code>.
The hostname we typed is mainly used to look up the address of the server that the request is sent to;
the connection itself only identifies the server's address, not the website we want.
The server needs the Host request header to know which of its websites we want.</p>
<p>The <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept">Accept request header</a>
lets the server know what kind of documents the client can accept.
<a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since">If-Modified-Since</a> means that the client only wants to receive
the document if it has been changed since the given time. If the client sends this it means that it already has a copy
from that date and time on disk, and if the server doesn't have a newer version it will say so in its response.
The server's response will not include the document in that situation.</p>
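<p>A conditional request like this can also be built from code. Here is a sketch using Java's built-in HTTP client (available since Java 11); the date is the example value from above, and the Host header is filled in automatically by the client:</p>

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ConditionalGet {

    // Build a GET request that only asks for the page if it changed
    // after the date of the copy we already have on disk.
    public static HttpRequest build() {
        return HttpRequest.newBuilder(URI.create("https://tacosteemers.com/articles.html"))
                .header("Accept", "text/html")
                .header("If-Modified-Since", "Sun, 02 Jan 2022 20:56:04 GMT")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build();
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

<p>Sending it with <code>HttpClient.send</code> would then yield either a 200 response with a body, or a 304 response without one.</p>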
<p>The server responds with:</p>
<ul>
<li>A status code</li>
<li>A list of response headers</li>
<li>The response body, which contains the actual document</li>
</ul>
<p>The <a href="https://en.wikipedia.org/wiki/List_of_HTTP_status_codes">statuscode</a> for this response is 200, which simply means "OK".
If the document on the server was not newer than the browser indicated with <code>If-Modified-Since</code> the server would have given
statuscode 304 "Not Modified" and the response body would have been empty.</p>
<p>Some of the response headers are:</p>
<div class="highlight"><pre><span></span><code><span class="c">Content-Length: 30224</span>
<span class="nt">Content-Type:</span><span class="w"> </span><span class="nl">text</span><span class="dl">/</span><span class="nl">html</span><span class="w"></span>
<span class="c">Last-Modified: Mon, 03 Jan 2022 06:33:28 GMT</span>
</code></pre></div>
<p>The first two response headers tell the client how to interpret the contents of the response body.
<code>Last-Modified</code> tells us that the document has indeed changed since we last accessed it.
My hosting company has also added two custom response headers that tell us which
webserver and loadbalancer this request and response have passed through.
They probably do this to allow them to diagnose problems in their network.</p>
<p>You may have noticed that we used the word <code>GET</code>, and wondered whether there are other such words.
We call these request methods. <a href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods">There are nine request methods</a>.</p>
<ul>
<li>GET</li>
<li>HEAD, a GET request without getting the body in the response</li>
<li>POST, where the client sends data to the server for further processing</li>
<li>PUT, where the client sends data that overwrites something that already exists on the server; that could be something that was POST-ed earlier.</li>
<li>DELETE</li>
<li>CONNECT, <a href="https://en.wikipedia.org/wiki/HTTP_tunnel#HTTP_CONNECT_method">a more complicated request method</a></li>
<li>OPTIONS, where the client asks the server what options there are for communicating with the server or a specific resource</li>
<li>TRACE, this method is new to me; apparently it is used for troubleshooting, and gives back information about what the request
looked like to the server after travelling through all the intermediary systems</li>
<li>PATCH, for sending instructions on how to partially update a resource or document</li>
</ul>
<p>I haven't used PATCH, but I imagine it is handy when the client doesn't have the document,
or doesn't want to send the whole document because it is too large, but does know what modifications need to be made to it.</p>
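<p>A PATCH body only carries the changes, not the whole document. As an illustration (my sketch, not part of the HTTP specifications themselves): RFC 7386 "JSON Merge Patch" is one standard format for PATCH bodies, and its rules fit in a few lines:</p>

```python
def json_merge_patch(target, patch):
    """Apply an RFC 7386 JSON Merge Patch to a JSON-like value.

    In a patch object, null (None) deletes a key, nested objects
    merge recursively, and any other value replaces the old one.
    """
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result

doc = {"title": "Old title", "tags": ["http"], "draft": True}
patch = {"title": "New title", "draft": None}
print(json_merge_patch(doc, patch))  # {'title': 'New title', 'tags': ['http']}
```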
<p>Here is <a href="https://datatracker.ietf.org/doc/html/rfc7540">the proposal for the current HTTP version, HTTP/2</a>, from May 2015.
The first eight request methods are described in <a href="https://datatracker.ietf.org/doc/html/rfc7231">the earlier HTTP/1 proposal</a>.
The PATCH method <a href="https://datatracker.ietf.org/doc/html/rfc5789#section-2">is described in a separate specification</a>.</p>Starting up: low friction, minimal process and minimal tools2022-01-02T00:00:00+01:002022-01-02T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2022-01-02:/articles/2022-02-01-starting_up_low_friction.html<p>Preserving project speed and enjoyment -- We can keep up a high iteration speed by keeping things simple and only introducing tools and processes when we absolutely have to.</p><p>When starting a new project we may not need everything that we need when working on established projects.
We can probably postpone setting up a build server and CI/CD, and upload builds from our own computers instead.
We may not need project management software. Perhaps it is just you, or you and two other people. You can use a chat app, emails and phone calls. A decision log can be kept in a notepad and emailed to the participants.</p>
<p>A few years ago (time flies, truly), I was working on the software side of a hardware platform project that took sensor input from an Arduino board, state output from (VR) games, and input from an existing piece of hardware that had an Arduino board attached to it that spoke the XBox controller protocol.
Together, this platform could simulate almost any type of vehicle.
I found it to be an interesting project, and it was refreshing to work with hardware again. I also enjoyed working with Arduino.</p>
<p>The other people on the project needed an easy way to receive my code. After they received my code it should be clear what they needed to do with it.
The project was without any income and thus without any infrastructure. The people working on it were not users of version control platforms such as GitLab or GitHub. I didn't think that introducing them to the modern software development process was a good use of our time.
Instead, I resorted to scraping files together and zipping them up in an ad-hoc way. Sometimes several times a day, as what I had written in the morning before work would have been tested during the day, and I sent in improvements after work.</p>
<p>The process of sending updates was tedious and error-prone, so I created a tool to help gather files, remove unwanted files and zip them up. I created that tool when I felt I needed it.
I did not create an installer and updater. Me emailing the file and the other person unzipping the file was all that was required. Simple, frictionless and very little development effort was required for making builds and installing them.
Which is good, because that is not the focus at the start of a project.</p>
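<p>Such a gather-and-zip tool can be very small. Here is a sketch of the idea; the exclusion list and the paths are made up and would need adjusting for a real project:</p>

```python
import zipfile
from pathlib import Path

# Assumed list of file types we never want to ship; adjust as needed.
EXCLUDED_SUFFIXES = {".log", ".tmp", ".zip"}

def make_build_zip(source_dir, zip_path):
    """Gather every wanted file under source_dir into a zip archive."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file() and path.suffix not in EXCLUDED_SUFFIXES:
                # Store paths relative to the project, not absolute paths.
                archive.write(path, path.relative_to(source_dir))
```

<p>Running something like <code>make_build_zip("project", "build.zip")</code> then produces a single file that can be emailed, which was all the release process this project needed.</p>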
<p>We can set up a CI/CD or a regular build and release process when <em>not having them</em> provides too much friction.
Before we reach that moment we can work with a loosely described standard operating procedure and some code to automate the annoying bits.</p>
<p>One of the other participants set up a Trello board. We put effort into filling it. In the end, though, I felt the other participants didn't use it.
They didn't record their thoughts or test results there, even though I did update my development tasks there.
I chatted and called a lot with the main person behind the project, who had the hardware.
I made notes of their conclusions and would iterate in the evening or in the morning.
We just didn't need project management software yet.</p>
<p>In the end the project didn't succeed as a business.
But working on it was a ton of fun.
It was as frictionless as it could have been.
By skipping a lot of administrative and non-core activities I was able to focus on what mattered: the minimum viable product.</p>
<p>My conclusion is that when starting a project we don't need to worry about what we will need to have in the future.
Instead, I prefer to focus on the project goals and creating standard operating procedures for the current tasks.
When friction appears we can perhaps automate that away, but we shouldn't get sidetracked on that.</p>Title keywords make company websites easier to find later on2021-12-27T00:00:00+01:002021-12-27T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2021-12-27:/articles/2021-12-27-keywords-in-website-title.html<p>It is a good idea to put some keywords in your landing page title. Without keywords in the title I can't find you anymore next week when I need you.</p><p>It is a good idea to put some keywords in your landing page title.
Without keywords in the title I can't find you anymore next week when I need you.
I mean the title of the page, not the title displayed <em>in</em> the page. In HTML terms, I mean the <code>&lt;title&gt;</code> tag in the <code>&lt;head&gt;</code>.</p>
<p>Some websites just state their name, like this made-up company example "FlowerHill Inc.".
Why is this a problem? They clearly state their name.
Yes, but the lack of keywords makes it difficult to find them again.</p>
<p>Let's say that this is a made-up landscaping company in my area.
The week after I first visited that website I find myself in need of a landscaping company.
This is a great opportunity for the company that I just found last week!
They are immediately on the top of my mind.</p>
<p>But there is a problem.
I search my browser history and my bookmarks.
I search them for these terms:</p>
<ul>
<li>"landscaping"</li>
<li>the three other local terms for landscaping</li>
<li>the colloquial name for my geographical area</li>
<li>the nearby town that they might have been located in</li>
<li>other towns...</li>
</ul>
<p>The page I am looking for isn't coming up.</p>
<p>By now I might remember that other company someone mentioned they used, and go with that instead.
Or I start a web search for "landscaping in ...", where many other companies will show up in the web search.</p>
<p>I had wanted to use that first company's website, their services or product.
Unfortunately I just couldn't find them anymore.
So let's make sure that we have some keywords in the landing page title.
For example: "FlowerHill landscaping and rainwater management -- Hill Valley City".</p>Catching an exception from an annotation on a JAX-RS resource2021-11-25T00:00:00+01:002021-11-25T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2021-11-25:/articles/catching_exceptions_from_annotated_resources.html<p>An exception resulting from an annotation cannot be caught in a regular try / catch block because the result of the annotation is computed before our code is executed.</p><p>An exception resulting from an annotation cannot be caught in a regular try / catch block because the result of the annotation is computed before our code is executed.</p>
<h2 id="example">Example</h2>
<p>Take this resource for example:</p>
<div class="highlight"><pre><span></span><code><span class="nv">@Resource</span><span class="w"></span>
<span class="nv">@Path</span><span class="p">(</span><span class="ss">"/example"</span><span class="p">)</span><span class="w"></span>
<span class="k">public</span><span class="w"> </span><span class="k">class</span><span class="w"> </span><span class="n">ExampleResource</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="nv">@GET</span><span class="w"></span>
<span class="w"> </span><span class="nv">@Path</span><span class="p">(</span><span class="ss">"/"</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="nv">@AnnotationThatThrowsException</span><span class="w"></span>
<span class="w"> </span><span class="k">public</span><span class="w"> </span><span class="n">Response</span><span class="w"> </span><span class="n">getExample</span><span class="p">()</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="k">try</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">Response</span><span class="p">.</span><span class="n">status</span><span class="p">(</span><span class="n">Response</span><span class="p">.</span><span class="n">Status</span><span class="p">.</span><span class="n">NO_CONTENT</span><span class="p">).</span><span class="n">build</span><span class="p">();</span><span class="w"></span>
<span class="w"> </span><span class="err">}</span><span class="w"> </span><span class="k">catch</span><span class="w"> </span><span class="p">(</span><span class="k">Exception</span><span class="w"> </span><span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">Response</span><span class="p">.</span><span class="n">serverError</span><span class="p">().</span><span class="n">entity</span><span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">getMessage</span><span class="p">()).</span><span class="n">build</span><span class="p">();</span><span class="w"></span>
<span class="w"> </span><span class="err">}</span><span class="w"></span>
<span class="w"> </span><span class="err">}</span><span class="w"></span>
<span class="err">}</span><span class="w"></span>
</code></pre></div>
<p>Here the code that runs because of the AnnotationThatThrowsException annotation throws an exception.
This happens before we actually enter the method.
Our try / catch block cannot catch the exception.
How do we handle this?</p>
<h2 id="catching-exceptions-with-a-filter">Catching exceptions with a filter</h2>
<p>In this example we know about an AccessDeniedException
that can be thrown from code that is run because of
an annotation. We want to return a 403 Forbidden status code
when that exception has been thrown.</p>
<p>Depending on your situation, the exception may already
have been caught and rethrown. For that reason we also
check any throwable to see if its cause is an
AccessDeniedException.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">javax.enterprise.context.ApplicationScoped</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.Filter</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.FilterChain</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.ServletRequest</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.ServletResponse</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.annotation.WebFilter</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.http.HttpServletResponse</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">java.io.IOException</span><span class="p">;</span>
<span class="nd">@ApplicationScoped</span>
<span class="nd">@WebFilter</span><span class="p">(</span><span class="n">filterName</span> <span class="o">=</span> <span class="s2">"ExceptionHandlingFilter"</span><span class="p">,</span> <span class="n">urlPatterns</span> <span class="o">=</span> <span class="s2">"/*"</span><span class="p">)</span>
<span class="n">public</span> <span class="k">class</span> <span class="nc">ExceptionHandlingFilter</span> <span class="n">implements</span> <span class="n">Filter</span> <span class="p">{</span>
<span class="n">private</span> <span class="n">static</span> <span class="n">final</span> <span class="n">Logger</span> <span class="n">LOGGER</span>
<span class="o">=</span> <span class="n">LoggerFactory</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="n">ExceptionHandlingFilter</span><span class="o">.</span><span class="n">class</span><span class="p">);</span>
<span class="nd">@Override</span>
<span class="n">public</span> <span class="n">void</span> <span class="n">doFilter</span><span class="p">(</span><span class="n">ServletRequest</span> <span class="n">request</span><span class="p">,</span> <span class="n">ServletResponse</span> <span class="n">response</span><span class="p">,</span> <span class="n">FilterChain</span> <span class="n">chain</span><span class="p">)</span>
<span class="n">throws</span> <span class="n">IOException</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">chain</span><span class="o">.</span><span class="n">doFilter</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="n">response</span><span class="p">);</span>
<span class="p">}</span> <span class="n">catch</span> <span class="p">(</span><span class="n">AccessDeniedException</span> <span class="n">e</span><span class="p">)</span> <span class="p">{</span>
<span class="p">((</span><span class="n">HttpServletResponse</span><span class="p">)</span> <span class="n">response</span><span class="p">)</span>
<span class="o">.</span><span class="n">sendError</span><span class="p">(</span><span class="n">HttpServletResponse</span><span class="o">.</span><span class="n">SC_FORBIDDEN</span><span class="p">,</span> <span class="n">e</span><span class="o">.</span><span class="n">getMessage</span><span class="p">());</span>
<span class="p">}</span> <span class="n">catch</span> <span class="p">(</span><span class="n">Throwable</span> <span class="n">t</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">getCause</span><span class="p">()</span> <span class="o">!=</span> <span class="n">null</span> <span class="o">&&</span> <span class="n">t</span><span class="o">.</span><span class="n">getCause</span><span class="p">()</span> <span class="n">instanceof</span> <span class="n">AccessDeniedException</span><span class="p">)</span> <span class="p">{</span>
<span class="p">((</span><span class="n">HttpServletResponse</span><span class="p">)</span> <span class="n">response</span><span class="p">)</span>
<span class="o">.</span><span class="n">sendError</span><span class="p">(</span><span class="n">HttpServletResponse</span><span class="o">.</span><span class="n">SC_FORBIDDEN</span><span class="p">,</span> <span class="n">t</span><span class="o">.</span><span class="n">getCause</span><span class="p">()</span><span class="o">.</span><span class="n">getMessage</span><span class="p">());</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="p">((</span><span class="n">HttpServletResponse</span><span class="p">)</span> <span class="n">response</span><span class="p">)</span>
<span class="o">.</span><span class="n">sendError</span><span class="p">(</span><span class="n">HttpServletResponse</span><span class="o">.</span><span class="n">SC_INTERNAL_SERVER_ERROR</span><span class="p">,</span> <span class="n">t</span><span class="o">.</span><span class="n">getMessage</span><span class="p">());</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<p>The filter should be registered automatically due to the filter's WebFilter annotation. If for some reason that doesn't work, one can try registering it like so:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">javax.inject.Inject</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.DispatcherType</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.FilterRegistration</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.ServletContainerInitializer</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.ServletContext</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">javax.servlet.ServletException</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">java.util.EnumSet</span><span class="p">;</span>
<span class="kn">import</span> <span class="nn">java.util.Set</span><span class="p">;</span>
<span class="n">public</span> <span class="k">class</span> <span class="nc">FilterInitializer</span> <span class="n">implements</span> <span class="n">ServletContainerInitializer</span> <span class="p">{</span>
<span class="nd">@Inject</span>
<span class="n">private</span> <span class="n">ExceptionHandlingFilter</span> <span class="n">exceptionHandlingFilter</span><span class="p">;</span>
<span class="nd">@Override</span>
<span class="n">public</span> <span class="n">void</span> <span class="n">onStartup</span><span class="p">(</span><span class="n">Set</span><span class="o"><</span><span class="n">Class</span><span class="o"><</span><span class="err">?</span><span class="o">>></span> <span class="n">c</span><span class="p">,</span> <span class="n">ServletContext</span> <span class="n">ctx</span><span class="p">)</span> <span class="n">throws</span> <span class="n">ServletException</span> <span class="p">{</span>
<span class="n">FilterRegistration</span><span class="o">.</span><span class="n">Dynamic</span> <span class="n">reg</span> <span class="o">=</span>
<span class="n">ctx</span><span class="o">.</span><span class="n">addFilter</span><span class="p">(</span><span class="s2">"ExceptionHandlingFilter"</span><span class="p">,</span> <span class="n">exceptionHandlingFilter</span><span class="p">);</span>
<span class="n">reg</span><span class="o">.</span><span class="n">setAsyncSupported</span><span class="p">(</span><span class="n">true</span><span class="p">);</span>
<span class="n">reg</span><span class="o">.</span><span class="n">addMappingForUrlPatterns</span><span class="p">(</span><span class="n">EnumSet</span><span class="o">.</span><span class="n">of</span><span class="p">(</span><span class="n">DispatcherType</span><span class="o">.</span><span class="n">REQUEST</span><span class="p">),</span> <span class="n">false</span><span class="p">,</span> <span class="s2">"/*"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<h2 id="exception-mappers">Exception Mappers</h2>
<p><a href="https://developer.jboss.org/docs/DOC-48310">Exception mappers</a> are another way to handle exceptions when using JAX-RS.
I believe these are the recommended solution for handling previously uncaught exceptions, and it is worth looking in to them. However, I haven't gotten exception mappers to work for an exception from an annotation.</p>Checking checksums2021-09-22T00:00:00+02:002021-09-22T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2021-09-22:/articles/checking-checksums.html<p>This page is about file checksums for situations where the distributor of the file also provides the checksum. If available, we always want to compare a given checksum with the checksum of the file we downloaded.</p><div class="toc"><span class="toctitle">Table of contents:</span><ul>
<li><a href="#topic">Topic</a></li>
<li><a href="#sha-checksums">SHA checksums</a></li>
<li><a href="#first-create-or-check-your-checksum-file">First, create or check your checksum file</a></li>
<li><a href="#using-sha256sum-gnu">Using sha256sum (GNU)</a></li>
<li><a href="#using-shasum-more-cross-platform">Using shasum (more cross-platform)</a></li>
<li><a href="#using-openssl">Using OpenSSL</a></li>
<li><a href="#comparing-hashes-by-hand">Comparing hashes by hand</a></li>
</ul>
</div>
<h1 id="topic">Topic</h1>
<p>This page is about file hashes (checksums) for situations where the distributor of the file also provides the checksum.</p>
<p>If you want to use checksums in your own code you might want to look at <a href="https://en.wikipedia.org/wiki/Cyclic_redundancy_check">the CRC-32 algorithm</a>.</p>
<p>The examples here are for SHA-256 checksums but can easily be adjusted to, for example, SHA-512. OpenSSL is also easy to use with other algorithms.</p>
<h1 id="sha-checksums">SHA checksums</h1>
<p>If available, we always want to compare a given checksum with the checksum of the file we downloaded.
This is to make sure nothing went wrong during transit, in memory or in storage.
Another reason is to make it less likely we fall for a man-in-the-middle attack. Checking the checksum for that
reason will only work if the man in the middle is not in a position to manipulate the page that lists the checksum.</p>
<h1 id="first-create-or-check-your-checksum-file">First, create or check your checksum file</h1>
<p>Before we run a checksum command on a file we need to have a corresponding checksum file
from the distributor of the file.
For example, I download a gradle binary distribution and the corresponding checksum file:</p>
<div class="highlight"><pre><span></span><code><span class="c">https://services.gradle.org/distributions/gradle-6.9.1-bin.zip</span>
<span class="c">https://services.gradle.org/distributions/gradle-6.9.1-bin.zip.sha256</span>
</code></pre></div>
<p>The content of this checksum file is only the hash, as we see here:</p>
<div class="highlight"><pre><span></span><code>$ cat gradle-6.9.1-bin.zip.sha256
8c12154228a502b784f451179846e518733cf856efc7d45b2e6691012977b2fe
</code></pre></div>
<p>The checksum tools that I use on Linux and macOS expect a format like the following:</p>
<div class="highlight"><pre><span></span><code><span class="err">8c12154228a502b784f451179846e518733cf856efc7d45b2e6691012977b2fe gradle-6.9.1-bin.zip</span>
</code></pre></div>
<p>Note that there are two spaces between the hash and the filename.
The second of those characters is a mode flag: a space means the file is checked in text mode, which is what we want here, while an asterisk would mean binary mode.</p>
<p>Let's create that file now, so we can use it in our examples:</p>
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> <span class="s2">"</span><span class="k">$(</span>cat gradle-6.9.1-bin.zip.sha256<span class="k">)</span><span class="s2"> gradle-6.9.1-bin.zip"</span> > gradle-6.9.1-bin.zip.sha256.checksum
</code></pre></div>
<h1 id="using-sha256sum-gnu">Using sha256sum (GNU)</h1>
<p>sha256sum is available on GNU/Linux distributions, as part of the coreutils.
As far as I know, sha256sum is not available on brew or macports.</p>
<div class="highlight"><pre><span></span><code>$ cat gradle-6.9.1-bin.zip.sha256.checksum <span class="p">|</span> sha256sum --check
gradle-6.9.1-bin.zip: OK
</code></pre></div>
<p>With --status it produces no output and only sets the exit code: 0 for success and 1 otherwise.
Useful for when you want to check the exit code in scripts.</p>
<div class="highlight"><pre><span></span><code>$ cat gradle-6.9.1-bin.zip.sha256.checksum <span class="p">|</span> sha256sum --check --status
</code></pre></div>
<p>We can also use it to create a checksum:</p>
<div class="highlight"><pre><span></span><code>$ sha256sum gradle-6.9.1-bin.zip
8c12154228a502b784f451179846e518733cf856efc7d45b2e6691012977b2fe gradle-6.9.1-bin.zip
</code></pre></div>
<h1 id="using-shasum-more-cross-platform">Using shasum (more cross-platform)</h1>
<p>shasum is available to Linux distributions and macOS. On macOS it needs to be installed with brew or macports.</p>
<p>We need to indicate which algorithm to use, with the -a argument.</p>
<div class="highlight"><pre><span></span><code>$ cat gradle-6.9.1-bin.zip <span class="p">|</span> shasum -a <span class="m">256</span> -c gradle-6.9.1-bin.zip.sha256.checksum
gradle-6.9.1-bin.zip: OK
</code></pre></div>
<p>Returning a status code works the same as it does with sha256sum:</p>
<div class="highlight"><pre><span></span><code>$ cat gradle-6.9.1-bin.zip <span class="p">|</span> shasum -a <span class="m">256</span> -c gradle-6.9.1-bin.zip.sha256.checksum --status
</code></pre></div>
<p>As does creating a checksum:</p>
<div class="highlight"><pre><span></span><code>$ shasum -a <span class="m">256</span> gradle-6.9.1-bin.zip
8c12154228a502b784f451179846e518733cf856efc7d45b2e6691012977b2fe gradle-6.9.1-bin.zip
</code></pre></div>
<h1 id="using-openssl">Using OpenSSL</h1>
<p><a href="https://www.garron.me/en/bits/how-to-md5sum-mac-os-x.html">openssl can also generate the hash</a> for us.</p>
<div class="highlight"><pre><span></span><code>$ openssl sha256 gradle-6.9.1-bin.zip
SHA256<span class="o">(</span>gradle-6.9.1-bin.zip<span class="o">)=</span> 8c12154228a502b784f451179846e518733cf856efc7d45b2e6691012977b2fe
</code></pre></div>
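<p>With openssl we still have to compare the resulting hash ourselves. Both the hashing and the comparison can also be scripted; here is a sketch in Python, reusing the gradle file names from the examples above:</p>

```python
import hashlib
import hmac

def sha256_of(path):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def checksum_matches(path, expected_hex):
    # hmac.compare_digest does a constant-time comparison; for a plain
    # download check a simple == would also be fine.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

<p>For example: <code>checksum_matches("gradle-6.9.1-bin.zip", open("gradle-6.9.1-bin.zip.sha256").read())</code>.</p>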
<h1 id="comparing-hashes-by-hand">Comparing hashes by hand</h1>
<p>With sha256sum and shasum we can let the tool compare the hashes.
Maybe we are using a tool that doesn't do the comparison for us, like openssl.
In that case comparing hashes can be done easily with python or any other scripting language.
We start the console, and copy and paste the hashes to do a string comparison.</p>
<div class="highlight"><pre><span></span><code>$ python
>>> <span class="s2">"the hash"</span> <span class="o">==</span> <span class="s2">"the hash"</span>
True
>>> quit<span class="o">()</span>
</code></pre></div>
<p>We do need to make sure we did copy and paste the two different hashes, instead of pasting the one hash twice. One way to be sure is copying and pasting something else before we copy and paste the second hash. </p>Thinking about how to organise my writing2021-09-18T00:00:00+02:002021-09-18T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2021-09-18:/articles/2021-09-18-thinking_about_how_to_organise_my_writing.html<p>Thinking out loud about where to place notes, as opposed to blog posts, and how to get them easy to find.</p><p>I am often conflicted on where to keep my notes.
Specifically, notes that can be public and that I would like to be able to access from work devices as well as private devices.
Things I know I will want to look up again in the future.
Preferably I would just post them here, on my website.
However, I often don't, for a number of reasons.</p>
<ul>
<li>Information can easily become outdated. </li>
<li>Sharing can feel uncomfortable because I might write something that is incorrect.</li>
<li>I think the notes are not high-quality, in-depth or useful enough for others.</li>
<li>I think nobody will find them anyway, so why share.</li>
<li>I'm not sure where to place them. Should I place them as blog posts or separate notes?</li>
</ul>
<p>As a result I tend to lose these types of notes, or I don't even bother to create them.</p>
<p>In this post I want to talk about the last point.
Where would I place them?
Notes will be added to over time. Blog posts don't expand in size, though there may be corrections.
Notes don't have a story, they usually consist of a few bullet points, command examples and documentation links.
I don't feel that blog posts and notes are a good mix.</p>
<p>Notes and blog posts could both be created as blog posts, and then separated by categories.
They would both show up in posts lists and share tags. As a result the notes would be easy to find.
One downside might be that blog posts don't have any hierarchical sorting. They have only chronological sorting.
I could work around that by just using one page per topic, expanding the page over time,
and then linking the topics with tags.<br>
The main benefit of adding my notes as blog posts would be that they can then share tags with the actual blog posts.
This improves the discoverability of the notes as well as the blog posts.
Some notes are already on this website, on the page "Code Notes and Snippets", which itself is hidden on another page.
Not easy to find, and it doesn't feel likely someone else might find the information when they need it.</p>
<p>It would be great if it was easier to find what I wrote.
For that reason I think I will start turning my notes in to blog posts. Even though conceptually they are not a good match. The blog post listing and the category pages will make the notes easier to find.</p>
<p>There are some open questions:</p>
<ul>
<li>How can I leave them out of the automatically generated RSS feed?</li>
<li>What will I do with the existing notes?</li>
<li>How can I make it really easy to add and edit notes? Currently, my website is statically generated from a specific computer.</li>
<li>How can I make my non-public writing accessible to myself on my private devices as well as my work devices?</li>
<li>What category name should I use for the notes? I already use "General Notes" and "Technical Notes" for my blog posts. Maybe something like "Quick Notes"?</li>
</ul>
<p>Perhaps I should not create too many categories. The current "Technical Notes" posts could move to "General Notes". The quick notes could then move to "Technical Notes".</p>
<p>It does occur to me that I am perhaps making things too difficult.</p>Automatically blocking a git commit if we detect a known mistake2021-08-10T00:00:00+02:002021-08-10T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2021-08-10:/articles/2021-08-10-automatically_blocking_a_git_commit_if_we_detect_a_known_mistake.html<p>It is possible to automatically block a git commit if it makes a known mistake. This can be done with a pre-commit hook.</p><p>Recently I made a mistake.
As a result I was thinking about something I hadn't done
in a while: making
<a href="https://githooks.com/">a git pre-commit hook</a>.
A pre-commit hook is code that runs before a commit,
and can block the commit if there are any problems.
People write git pre-commit hooks to help detect
problems such as files with cross-platform encoding
and line-ending issues,
<span id="footnote-1-return">filenames that differ
only in capitalization <a href="#1">(1)☟</a>,</span>
<span id="footnote-2-return">and files that are not
allowed to end up in a repository <a href="#2">(2)☟</a>.</span></p>
<p>Another problem that we can fix with a git pre-commit hook is mistakes in URL linking spotted by Pelican. These show up in the build output: <code>WARNING: Unable to find '/articles/abc.html', skipping url replacement.</code> We can check the build output for this type of problem. One way is to display the build output to the user while also storing it in a file. In the pre-commit hook we then scan that file for this type of warning.</p>
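<p>A minimal sketch of that check, assuming the build output has already been saved to a log file (the function name and the exact warning text to match are my own choices, not taken from an official Pelican interface):</p>

```shell
#!/bin/sh
# check_build_log FILE: fail (return non-zero) if the saved Pelican build
# output contains the "skipping url replacement" warning that indicates a
# broken internal link.
check_build_log() {
    if grep -q "skipping url replacement" "$1"; then
        echo "Pelican reported a broken internal link; aborting commit." >&2
        return 1
    fi
    return 0
}
```

<p>The hook itself would first run something like <code>pelican content 2>&1 | tee build.log</code>, then call <code>check_build_log build.log</code> and exit non-zero to block the commit.</p>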
<p>Many pre-commit hooks have been shared online. You may be able to find some that are useful to you.
For example, you will find various indentation related pre-commit hooks if you do a web search on "git pre-commit hook indentation".</p>
<p>What happened is that I updated my website, changed my mind about the title of
the article I had just added, changed it, and updated
the website again.
The problem is that I also changed the slug to match the
title, which is what the URL is based on. This makes it
look like a different article.
It is possible that the article appeared twice in some
people's feeds, as it did in mine.</p>
<p>Not a big deal. Except that I don't like making mistakes. Especially if they are avoidable.</p>
<p>To make this type of mistake less likely I am
adding two small pieces of automation.
First is adding a small script that will detect if
the RSS feed has changed even though it still contains
the same number of article titles.
This runs as a git pre-commit hook.
<a href="/files-posts/code/pre-commit/rss_feed_equality_check.sh">This is what I am testing now</a>.
The basic idea works, but it does depend on
using one commit per article.
It does not work if we commit a new article at the same
time that we change the slug on an existing article.
To me that is acceptable; I already prefer that type of
commit, where each commit represents one topic.</p>
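<p>The basic idea could be sketched like this. This is a simplification, not the script linked above; the function names, and the assumption that each article contributes one <code>&lt;title&gt;</code> and one <code>&lt;link&gt;</code> line in the feed, are mine:</p>

```shell
#!/bin/sh
# slug_change_suspected OLD_FEED NEW_FEED: succeed (exit 0) when the two
# feeds contain the same number of <title> entries but different <link>
# entries -- the signature of a renamed slug rather than a new article.
feed_links() {
    grep -o '<link>[^<]*</link>' "$1"
}

slug_change_suspected() {
    old="$1"
    new="$2"
    # A different title count means an article was added or removed; fine.
    [ "$(grep -c '<title>' "$old")" -eq "$(grep -c '<title>' "$new")" ] || return 1
    # Same number of titles but different links: a slug probably changed.
    [ "$(feed_links "$old")" != "$(feed_links "$new")" ]
}
```

<p>A pre-commit hook could compare the committed copy of the feed (for example via <code>git show HEAD:path/to/feed.xml</code>) with the freshly built one, and refuse the commit when this function succeeds.</p>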
<p>Second is adding some scripting that will work as a Pelican pre-upload hook.
The script will need to stop the upload if there are any modified
files left.
In other words, all files must have been committed before we
can upload the new website content.
This way we know for sure the git pre-commit hook
has been run.</p>
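<p>That check is small. A sketch, with the wiring into the upload step left out:</p>

```shell
#!/bin/sh
# worktree_is_clean: succeed only when there is nothing left to commit.
# `git status --porcelain` prints one line per modified or untracked file,
# so empty output means the working tree is clean.
worktree_is_clean() {
    [ -z "$(git status --porcelain)" ]
}
```

<p>The upload script can then start with something like <code>worktree_is_clean || { echo "Commit your changes before uploading." >&2; exit 1; }</code>.</p>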
<h4 id="footnotes">Footnotes</h4>
<h4 id="1">1</h4>
<p>Filesystems used by Windows ignore capitalization in filenames. Other filesystems do not.
This results in problems when changing the capitalization of a version-controlled filename on a Windows computer.
This change may not end up in a commit. As a result a build may succeed locally, but fail on other computers. <a href="#footnote-1-return">☝</a></p>
<h4 id="2">2</h4>
<p>See my article
<a href="/articles/2020-10-23-some-files-and-information-should-not-be-in-source-control.html">"Some files and information should not be in source control"</a> for reference.
By creating a pre-commit hook for these situations we can detect these types of files before we commit them. <a href="#footnote-2-return">☝</a></p>
I think this is because some people click the "fork me" button on any project that they want to play around with. That is not necessary, and it has a downside.</p>
<p>By forking many projects they end up with their original work mixed in with pages full of forked projects. Their handful of original projects are difficult to find among the many forks. To the casual profile visitor it looks as though they don't understand how version control can be used well together with these online platforms.</p>
<p>If you want to check out a project, just do "git clone" or equivalent on the main project. Create a local copy. Rebase it when you want to pull in updates. If you have changes you can go through that project's steps for contributors to get your changes included. This can be as simple as opening a pull request in whichever way is standard for that repository hosting service. Exact details will be different for each project. If it is unclear you might search for contact details of current contributors and ask them how to proceed. </p>
<p>If they don't want to merge your changes you can consider forking. When you do fork, you will have to keep your fork up to date. If not because of the feature updates then at least because of the security updates that may have been done on the original project. Note that updates to which versions of dependencies the project uses can also include security updates. As a result, security improvements may be mixed in with other types of updates.</p>
<p>When you fork a project online, you are offering that project in that state. People might take you up on your offer and start using your fork as-is, instead of the original. You have some responsibility there. </p>Diagramming can be a valuable tool for thinking as well as communication2020-11-22T00:00:00+01:002020-11-22T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2020-11-22:/articles/2020-11-22-diagramming-can-be-a-valuable-tool.html<p>Words alone are sometimes not sufficient. Diagrams can help us understand a situation in new ways, assisting with our thinking as well as communication. If you don't diagram already, try it out!</p><div class="toc"><span class="toctitle">Table of contents:</span><ul>
<li><a href="#why-create-diagrams">Why create diagrams?</a></li>
<li><a href="#where-i-am-coming-from-on-this-topic">Where I am coming from on this topic</a></li>
<li><a href="#what-does-a-diagram-look-like">What does a diagram look like?</a></li>
<li><a href="#pick-any-tool-to-start">Pick any tool to start</a></li>
<li><a href="#automatic-class-diagrams">Automatic class diagrams</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
</div>
<h2 id="why-create-diagrams">Why create diagrams?</h2>
<p>I don't do a lot of diagramming, but occasionally it can
be a valuable tool.
Diagrams can help us understand a situation in new ways.
Words alone may not be sufficient, especially if the
people who are trying to communicate come from
different backgrounds or don't share a native language.
Diagramming can be a fast and powerful way to communicate.
The diagram can be a tool for thinking,
individually or as a group.
It can be more than just a way of creating documentation.
It doesn't always need to be 100% correct when it is used
as a tool for generating new insights.
If you don't diagram already, try it out!</p>
<h2 id="where-i-am-coming-from-on-this-topic">Where I am coming from on this topic</h2>
<p>Some people may have learned to make diagrams while they were studying.
A lot of people who program or design systems did not study computer science
and may not be familiar with the practice.
The same is true for myself. </p>
<p>Over the years I became accustomed to reading diagrams.
Some people I worked with were old-school computer scientists who used
a specific notation <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">(UML)</a>
that took some time getting used to, especially
as it wasn't really explained to me, and I did not have reference
material.
Occasionally I was tasked with creating class diagrams in specific UML tools,
but personally I don't find creating hand-made class diagrams useful because they become
outdated.
When I was thinking about a new system or trying to find the source
of bugs in a complex interaction I would doodle a bit on paper,
but I was not comfortable sharing these drawings because I did not like
the idea that people would think I didn't do them 'right'.
Due to all this I quickly forgot about diagramming when I moved on to
other assignments.</p>
<p>Today, I don't want 'right' to get in the way of 'useful'.
I don't think we need to go all-in with the
<a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">Unified Modelling Language (UML)</a>
to enjoy the benefits of the occasional quick diagram.</p>
<h2 id="what-does-a-diagram-look-like">What does a diagram look like?</h2>
<p>The <a href="https://plantuml.com/">official PlantUML site</a> and
<a href="https://real-world-plantuml.com/">Real World PlantUML</a>
have good examples that might get you inspired,
and show the basic building blocks.
A specific page I want to share is the
<a href="https://www.uml-diagrams.org/deployment-diagrams-overview.html">deployment diagrams overview on uml-diagrams.org</a>.
I think it is a good demonstration of how a lot of useful information
can be expressed in a diagram.</p>
<p>Of course there are also different situations where you might diagram, such as <a href="https://real-world-plantuml.com/umls/5564018462818304">describing use cases</a>.</p>
<p>My own <a href="https://tacosteemers.com/pages/plantuml-notes.html#system-diagram-example">PlantUML notes page</a>
shows the source of this example diagram.</p>
<img alt="uml diagram" class="uml" src="/images/59f9d46a.png"><p>Personally I don't bother too much with the 'correct' ways to draw something.
As long as the general idea is clear the diagram can be of value.
I do think that it will benefit the usefulness of your diagrams if
you look up the basic conventions for the type of diagram you want to make.
For example, there is a specific way to draw the connection between
<a href="https://www.uml-diagrams.org/interface.html">an interface and an implementation</a>.
It needs to be clear which element is the interface, which elements are using the interface
and which elements are implementing the interface.
It is good to use notation that people may already be familiar with and
can be understood in the future, when we have moved on.</p>
<h2 id="pick-any-tool-to-start">Pick any tool to start</h2>
<p>I like the <a href="https://plantuml.com/">PlantUML</a> tool. I use it as a command on the
commandline, and write the diagram in whichever plain text editor is at hand.
They also <a href="http://www.plantuml.com/plantuml/uml/">provide an online service</a>.
I think PlantUML is a low-complexity way to go.
Unfortunately error messages can be short and unclear.
The example pages I linked to earlier can be
handy to see how types of diagrams can be made. </p>
<p>Personally I feel that pencil and paper or any drawing program
can be a valid tool for diagramming.
Especially when we just want to get the basic ideas on paper,
in a way that can be used as a starting point for further thinking.</p>
<p>If we are using software it can be handy if
we also have a drawing program with support for multiple layers.
Then we can add notes on top in a different color,
like we might do with pencil and paper during a conversation
with our colleagues. The multiple layers then make it possible
to switch layers of notes on and off.</p>
<p>Pick any tool from a search result. Try out several if you like.</p>
<h2 id="automatic-class-diagrams">Automatic class diagrams</h2>
<p>Sometimes we want to have class diagrams for existing codebases.
Luckily many integrated development environments support creating them automatically.
I have done so in the past with Microsoft's IDEs.
<a href="https://www.jetbrains.com/help/idea/class-diagram.html">IntelliJ IDEA</a>
also seems to support automatic class diagrams.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Diagramming can be a great way to jot down your thoughts,
and offload whichever system design you are thinking about
to paper, or computer.
It allows us to think about the system in new ways.
This is difficult to explain in words, and best experienced by yourself.
If you don't see the use yet, just keep it in mind for the next time you
need to communicate your ideas, or analyze a system.</p>On clarifying the status of demoed products2020-10-25T02:00:00+01:002020-10-25T02:00:00+01:00Taco Steemerstag:tacosteemers.com,2020-10-25:/articles/2020-10-25-on-clarifying-the-status-of-demoed-products.html<p>Giving a demonstration of a future product or feature can be a great way to check if development is on the right track. Unfortunately stakeholders don't always understand that a demo can be very far from a finished product.</p><p>Giving a demonstration of a possible product or feature can be a great way to get feedback
and check if development is on the right track.
Unfortunately stakeholders and customers don't always understand that a demo can be very far from a finished product.</p>
<p>A discussion of this phenomenon can be found <a href="https://news.ycombinator.com/item?id=23835918">here on Hacker News</a>.</p>
<p>Personally I can't recall having had a lot of issues with this when the customer is inside my own (technical)
organisation. It did get me thinking.
It is important to tell our outward-facing colleagues that based on this demo
we cannot make any promises to anyone outside the organisation.
Unfortunately a generic statement like that may not be taken seriously, because it looks like standard boilerplate.
Perhaps an analogy to the physical world might help?
"What we are showing today is similar to an architect's 3D model of a house. We want to show the 3D model to get feedback. No work has been done on the actual house, and it might take a long time to build."
This type of analogy may seem silly, I understand that. But I would rather be safe than sorry.</p>
<p>The situation gets more difficult when it comes to external customers.
They may not be familiar with how long it can take to fully develop a feature or product,
and get it ready for launching.
They might think we have more people working on the project than we actually have.
In our enthusiasm to show what we are working on we might end up turning the customer against us.
It will look like the project is not working out if there are no regular updates after a demo.
Enthusiasm will go down over time even if there are regular updates. </p>
<p>I think there should only be a project demonstration if we can give ourselves
a hard deadline that we will make no matter what happens.
That means the project needs to be at a certain stage of maturity before we demo it. It also means that
we need to be able to add more people, or the right experienced people, when the project has setbacks.
We need to be certain we actually want to do the project, and that we are able to.
Once we have demonstrated the future product to customers, it is bad form to cancel it.</p>Some files and information should not be in source control2020-10-23T00:00:00+02:002020-10-23T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-10-23:/articles/2020-10-23-some-files-and-information-should-not-be-in-source-control.html<p>Which are they, what should we do with them instead, and how can we avoid mistakes?</p><p>Some files and information should not be stored in version control systems such as git.
Which are they, what should we do with them instead, and how can we avoid mistakes?</p>
<div class="toc"><span class="toctitle">Table of contents:</span><ul>
<li><a href="#secrets">Secrets</a></li>
<li><a href="#secrets-in-practice">Secrets, in practice</a></li>
<li><a href="#generated-files">Generated files</a></li>
<li><a href="#an-exception-generated-interfaces">An exception, generated interfaces</a></li>
<li><a href="#other-files">Other files</a></li>
<li><a href="#what-about-backups">What about backups?</a></li>
<li><a href="#what-about-documentation">What about documentation?</a></li>
<li><a href="#how-to-avoid-adding-secrets-to-version-control">How to avoid adding secrets to version control</a></li>
<li><a href="#how-to-avoid-adding-unwanted-files-to-git">How to avoid adding unwanted files to Git</a></li>
</ul>
</div>
<h2 id="secrets">Secrets</h2>
<p>Examples of files that do not belong in a version control system are
(unencrypted) files containing API credentials, keys, and anything else that is supposed to
stay secret.</p>
<p>Over the course of a project's lifetime many people might get access to the source code.
This makes the source code an unsafe place to keep secrets.</p>
<p>Another aspect is that some secrets change, such as credentials.
This is easier to do if the secrets are stored separate from the source code.
If we had them in a distributed source control system it would be more work
to change them for all active versions of the software.
It would also require a new release and deploy.</p>
<p>Loose secrets and files that contain secrets are usually made available to the applications
through environment variables, or a shared data source such as a secret store.
The environment variables can be set by the software responsible for deploying,
starting and stopping the applications.
An <a href="https://www.vaultproject.io/use-cases/secrets-management">example of a secret store is Vault</a>.</p>
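<p>For the environment variable route, an application can fail fast at startup when a secret is missing, instead of failing later at the first use. A sketch in shell; the variable name is only an example:</p>

```shell
#!/bin/sh
# require_env NAME: fail with a clear message if the named environment
# variable is unset or empty. The secret value itself is never printed.
require_env() {
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "Missing required environment variable: $1" >&2
        return 1
    fi
}
```

<p>A startup script would call <code>require_env DB_PASSWORD</code> before launching the application, so a misconfigured deployment fails immediately with a readable message.</p>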
<h2 id="secrets-in-practice">Secrets, in practice</h2>
<p>In practice, we are likely to find that both methods are used to make secrets available.
The URI of the secret store could be stored in the application database.
To get that information we need to connect to the application database first.
We can't connect to a database to get the database password from the database.
Thus, the database connection details may be passed through environment variables,
or the deployment software may write them to a file on the application server. </p>
<h2 id="generated-files">Generated files</h2>
<p>Files that are the result of build steps, such as output from
generators and compilers, should not be added to a source control
management system. These are not source files.
Any changes made to them will be overwritten the next time the build is run.
Another example is files created by runtime environments, such as the <code>__pycache__</code>
directory which is created when a Python program is run.</p>
<h2 id="an-exception-generated-interfaces">An exception, generated interfaces</h2>
<p>As far as I know there is only one type of exception to the rule.
Generating a <a href="https://en.wikipedia.org/wiki/SOAP">SOAP interface</a> from a <a href="https://en.wikipedia.org/wiki/Web_Services_Description_Language">local WSDL</a>
file during every build is a waste of resources. It can be an acceptable solution to generate it once and add the output
to the project source files. An alternative to adding the output to version control is to package it as an artifact
(dependency) and add it to the organization's private artifact repository. </p>
<h2 id="other-files">Other files</h2>
<p>An example of other more mundane files is the .DS_Store file.
This is a MacOS file for storing details of how a directory needs to be shown on the desktop.
It is unrelated to the project.</p>
<p>IDE files such as <code>IML</code> files and <code>.idea</code> directories should also not be added. These contain
the developer's personal settings and preferences.
Occasionally we may share them to help a new colleague get up and running, but it is not
part of the project source code.</p>
<h2 id="what-about-backups">What about backups?</h2>
<p>There is no need to store several versions of the file next to each other in the project
directory.
The version control system controls file versioning.
The previous version of the file is the backup.</p>
<p>Database backups don't belong in the source control system.
They belong on a properly secured storage server.</p>
<h2 id="what-about-documentation">What about documentation?</h2>
<p>Personally I feel that some level of documentation can be good to add.
This includes instructions about development dependencies, local development setup,
and documents concerning integration with external APIs.
Having this type of information close at hand can be very helpful to developers.</p>
<h2 id="how-to-avoid-adding-secrets-to-version-control">How to avoid adding secrets to version control</h2>
<p>This is a problem that probably does not have a full technical solution.
Awareness is key. </p>
<p>There are projects such as <a href="https://github.com/awslabs/git-secrets">git-secrets</a> that
try to solve this. Personally I have not used git-secrets or similar tools.
Secrets detection is tricky to automate and won't be foolproof.
I imagine that they can detect secrets that they already know: common secrets such as
AWS-related credentials. Secrets specific to your software, on the other hand, I expect
to be difficult to detect. The creators of the tool are not familiar with them.</p>
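<p>Even without a dedicated tool, a pre-commit hook can do a naive scan of the staged changes for project-specific patterns. This is only a sketch and no substitute for real tooling; the pattern list is a placeholder you would tune to your own project's secrets:</p>

```shell
#!/bin/sh
# staged_secrets: print any staged diff lines matching a few suspicious
# patterns. Empty output means nothing was flagged.
staged_secrets() {
    git diff --cached --unified=0 \
        | grep -E -i 'password *=|secret_key|BEGIN .*PRIVATE KEY' || true
}
```

<p>The hook would then block the commit whenever <code>staged_secrets</code> prints anything, and show the matching lines to the developer.</p>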
<h2 id="how-to-avoid-adding-unwanted-files-to-git">How to avoid adding unwanted files to Git</h2>
<p>Git has a special file, the <a href="https://git-scm.com/docs/gitignore">.gitignore</a> file.
This can be used to specify a list of files that should not be added to the source control
system. The file itself is always added to the source control system, so that every
developer can benefit from it.</p>
<p>This file is easy to create.
Here is an example <code>.gitignore</code> file for a website project generated with Pelican.
The developers are using a MacOS computer and the IntelliJ IDEA development environment.</p>
<div class="highlight"><pre><span></span><code><span class="na">.DS_Store</span>
<span class="err">*</span><span class="na">.iml</span>
<span class="na">.idea</span>
<span class="nf">generated</span><span class="err">/</span>
<span class="nf">pelican</span><span class="err">/</span><span class="no">output</span><span class="err">/</span>
<span class="nf">pelican</span><span class="err">/</span><span class="no">__pycache__</span><span class="err">/</span>
</code></pre></div>
<p>As we can see, we ask Git to ignore the MacOS-specific file, the IDEA specific files and
directories, and the output directories.</p>Prefer to create constructive or uplifting conversations at work2020-10-18T00:00:00+02:002020-10-18T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-10-18:/articles/2020-10-18-prefer-to-create-constructive-or-uplifting-conversations-at-work.html<p>It is healthy to discuss negative parts of situations. When we are part of such a discussion, it is good if we can turn it in to an uplifting or constructive conversation.</p><p>It is natural for people to focus on negative parts of
situations.
We experience something we don't like and want to discuss it.
This can be healthy. We are letting off some steam, as it were.</p>
<p>Occasionally we might be actively participating
in a discussion with a negative tone.
If we don't take action or give actionable advice then
all we are doing is complaining.
Complaining is not helpful if we do it too often.
On top of that, the situations we are bringing up for
discussion are probably not new to our colleagues.
They know about it, they know it is not optimal. They haven't
had the energy or drive yet to fix it.
If we are not bringing something constructive or positive to
the discussion we will end up taking more of their time and
energy. </p>
<p>It has become clear to me that this type of conversation tends
not to become a constructive conversation,
unless we make a conscious effort.
If we let the conversation flow naturally it is rare for this
kind of conversation to become one where solutions are offered
and follow-up actions are defined.</p>
<p>When this kind of conversation comes up with colleagues,
we have three good options.
Option one is to let people just get it out, but keep it short.
If the topic continues, we can go to options two and three.
Option two is to share actionable advice now or even offer to
solve the problem, if we can do so and the person is open to
it. If we can't, option three is to offer to schedule a meeting
with people who might.
At this point the people we are talking with will indicate
whether there really is a problem that needs solving, or they
were just letting off some steam.</p>
<p>Sometimes a complaint comes up that we just can't really do
anything with. This can cost us a lot of energy.
The best way to handle this type of conversation may be to
acknowledge the complaint, but add something that turns it
into an uplifting conversation.
Preferably there is something happy to be said related to
the complaint. If the complaint goes on too long, and we can't
think of anything relevant and useful to say, we can always
transition the conversation to the good weather, sports,
or whatever happened in a famous television show. </p>Dark and light web themes: consider using a hybrid CSS/JS implementation2020-10-17T00:00:00+02:002020-10-17T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-10-17:/articles/2020-10-17-dark-and-light-web-themes-consider-using-a-hybrid-approach.html<p>Instead of using either CSS media queries for operating system theme preferences or a Javascript-based theme selector we can use both: automatic CSS-based switching, and JS-based switching where the user can choose.</p><p>An excellent article on website dark mode and light mode implementation can be found <a href="https://css-tricks.com/a-complete-guide-to-dark-mode-on-the-web/">here on css-tricks</a>.
It describes a style sheet-based implementation and a Javascript-based implementation.
The style sheet-based implementation uses the user's operating system preferences to automatically select a dark or light theme.
The Javascript-based implementation allows a user to select
the theme they want to use.</p>
<p>Adding to that article, I want to advocate for a hybrid approach where
we use both: automatic CSS-based switching, and JS-based switching
where the user can choose.</p>
<h2 id="the-advantage">The advantage</h2>
<p>The advantage is that we can add theme selection, while the default theme can
be the preferred theme as configured at the operating system level.
This way all options are open for users who browse with JS enabled.
Users who browse with JS disabled will still get the style that they
have selected as the preferred style in their operating system
preferences.</p>
<h2 id="the-themeswitcher-control">The themeswitcher control</h2>
<p>The tutorial linked to at the beginning of this article shows an
example implementation.
The tutorial code removes CSS classes and adds CSS classes
when the user switches to a different theme. </p>
<p>My own implementation changes the entire stylesheet.
This is done by changing the stylesheet href.</p>
<p>The stylesheet is linked as follows:
<code><link rel="stylesheet" id="css_colors" href="/css/colors/1.css" /></code></p>
<p>It can be changed with Javascript in the following way:
<code>document.getElementById("css_colors").href="/css/colors/2.css";</code></p>
<p>The manual themeswitcher on this website is currently a big dropdown
control. That is not necessary.
It can be a button, an anchor link or a simple icon.</p>
<h2 id="the-downside-to-my-hybrid-approach">The downside to my hybrid approach</h2>
<p>The downside to my hybrid approach is that there is some duplication.
For the CSS-only functionality we need a stylesheet that uses
switching based on media queries.
For the JS-based functionality we need to be able to load a stylesheet
specific to the chosen theme.
The contents of these stylesheets would partially overlap.
As a proponent of the 'do not repeat yourself' idea this is a downside.
Duplicated source code makes it easy to create inconsistencies. </p>
<h2 id="solution-to-the-downside">Solution to the downside</h2>
<p>To avoid having to write duplicate stylesheet contents we can generate
the required stylesheets from de-duplicated input files.
My scripts for generating the CSS <a href="https://gitlab.com/taco.steemers/generate_os_preferred_theme_switching">can be found here</a>.</p>
<p>Let's take a look at how this would work.</p>
<p>The three color stylesheets can be generated from three input files:</p>
<ol>
<li>A file containing the CSS rules that apply the color variables</li>
<li>A file containing the light mode color variables</li>
<li>A file containing the dark mode color variables.</li>
</ol>
<p>We might have four stylesheet files for the website:</p>
<ul>
<li>The general stylesheet that does not contain color information</li>
<li>The general color stylesheet that contains both light and dark mode color information, for operating system preference support.
This is input files 1, 2 and 3 combined in a specific structure.</li>
<li>The light mode stylesheet for the Javascript switching support.
This is input files 1 and 2 combined.</li>
<li>The dark mode stylesheet for the Javascript switching support.
This is input files 1 and 3 combined.</li>
</ul>
<p>Based on this idea, the general color stylesheet would look like this:</p>
<div class="highlight"><pre><span></span><code><span class="cm">/* This section is identical to the light mode CSS file contents */</span><span class="w"></span>
<span class="err">:</span><span class="n">root</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="n">color</span><span class="o">-</span><span class="nl">scheme</span><span class="p">:</span><span class="w"> </span><span class="n">light</span><span class="w"> </span><span class="n">dark</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="c1">--text-color: black;</span>
<span class="w"> </span><span class="c1">--background-color: white;</span>
<span class="err">}</span><span class="w"></span>
<span class="nv">@media</span><span class="w"> </span><span class="p">(</span><span class="n">prefers</span><span class="o">-</span><span class="n">color</span><span class="o">-</span><span class="nl">scheme</span><span class="p">:</span><span class="w"> </span><span class="n">dark</span><span class="p">)</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="cm">/* This section is identical to the dark mode CSS file contents */</span><span class="w"></span>
<span class="w"> </span><span class="err">:</span><span class="n">root</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="n">color</span><span class="o">-</span><span class="nl">scheme</span><span class="p">:</span><span class="w"> </span><span class="n">light</span><span class="w"> </span><span class="n">dark</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1">--text-color: white;</span>
<span class="w"> </span><span class="c1">--background-color: black;</span>
<span class="w"> </span><span class="err">}</span><span class="w"></span>
<span class="err">}</span><span class="w"></span>
<span class="cm">/* This section contains the rules for applying the color variables */</span><span class="w"></span>
<span class="n">body</span><span class="w"> </span><span class="err">{</span><span class="w"></span>
<span class="w"> </span><span class="nl">color</span><span class="p">:</span><span class="w"> </span><span class="nf">var</span><span class="p">(</span><span class="c1">--text-color);</span>
<span class="w"> </span><span class="n">background</span><span class="o">-</span><span class="nl">color</span><span class="p">:</span><span class="w"> </span><span class="nf">var</span><span class="p">(</span><span class="c1">--background-color);</span>
<span class="err">}</span><span class="w"></span>
</code></pre></div>
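<p>The conclusion below argues for letting the user choose a theme in addition to the automatic selection. One possible way to support that with the variables above (a sketch; the <code>data-theme</code> attribute is illustrative and not part of the original stylesheet) is to have a small script set an attribute on the root element, with rules that out-rank the media-query defaults by specificity:</p>

```css
/* Sketch of a manual override; a script would set data-theme on <html>.
   These selectors are more specific than the plain :root rules,
   so they win regardless of the prefers-color-scheme media query. */
:root[data-theme="light"] {
    --text-color: black;
    --background-color: white;
}
:root[data-theme="dark"] {
    --text-color: white;
    --background-color: black;
}
```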
<h2 id="conclusion">Conclusion</h2>
<p>If we only support automatic theme selection, a user might be forced
to use a theme that they don't want.
The user might prefer their operating system controls in a dark theme,
but that does not guarantee that they want to see our website in our
website's dark theme.
For that reason I prefer having the automatic system as well as
giving the user the option to choose a theme.
It takes a bit more work to support both,
but it helps make our websites more accessible.</p>Usability anti-pattern: no controls in fullscreen2020-10-16T00:00:00+02:002020-10-16T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-10-16:/articles/2020-10-16-ux-anti-pattern-no-controls-in-fullscreen.html<p>A fullscreen window needs to have an obvious button to get out of fullscreen mode.</p><h2 id="general-observations-on-the-zoom-user-interface">General observations on the Zoom user interface</h2>
<p>The Zoom video meeting desktop application has different windows and
types of controls.
It regularly uses three main windows.
These windows have overlays, drop down menus, and regular buttons.</p>
<p>Some buttons move around between different windows, based on the
currently active features.
For example, the mute / un-mute button moves around based on whether
screen sharing is active on your machine.</p>
<p>Some labels look like buttons but are in fact not buttons.</p>
<p>There is a taskbar at the bottom of one of the main windows that contains
buttons.
Other buttons and labels can be found on the top edge,
or above the bottom taskbar.</p>
<p>There is also one main window that has a right-click menu.</p>
<p>There are other separate windows that pop up when clicking controls
on the main windows, such as the chat window.</p>
<p>As a whole it is messy. Some of the user interface choices are
perhaps understandable due to the functionality in the application.</p>
<h2 id="fullscreen-mode">Fullscreen mode</h2>
<p>The least user-friendly part of the application can be seen during
fullscreen mode.
When a participant shares a screen, it is immediately shown fullscreen
to the other participants.</p>
<p>During some meetings the shared screen window has a menu that appears when
the mouse comes close to the top edge. It can be used to exit fullscreen.</p>
<p>Unfortunately there appears to be a different type of screen sharing as
well. During other meetings the fullscreen shared window does not have
any button to turn the fullscreen window into a regular window.
There are no controls at all.
Moving the mouse or clicking the screen does not make any controls
appear. The escape key does not work.
In a sense, Zoom locks the user out of their computer.
It turns out that double-clicking the video in
the shared window exits fullscreen mode.
This is not obvious; it is the only
double-click interaction in the application.</p>
<h2 id="conclusion">Conclusion</h2>
<p>It should be easy to escape fullscreen mode.
Especially in the case of Zoom screen sharing.
The user did not choose to enter fullscreen mode.
Zoom did that by itself.
This is confusing to the user.</p>Thoughts on self-managing teams2020-10-08T00:00:00+02:002020-10-08T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-10-08:/articles/2020-10-08-thoughts_on_self-managing_teams.html<p>Self-managing teams are great. People feel their shared responsibility and come together to solve problems. There should be someone that is authorized to make decisions on specific topics. And the topic 'people' should not be forgotten.</p><p>A self-managing team is a team without supervisory managers.
Everyone who works toward the mission is an active member of the team.
This kind of team can also be called a flat team.
I find there to be many upsides to being part of a flat team.
These mainly come from the fact that it allows the subject matter
experts a great deal of freedom to decide how the work will be done.</p>
<h2 id="mutual-decisions-based-on-shared-responsibility">Mutual decisions based on shared responsibility</h2>
<p>It seems widely accepted that a software product development team
needs a product owner.
Someone who is a kind of customer representative, and keeps track of
progress towards the long-term vision of the product and the business.
Another widely accepted role, either implicit or explicit, is the
technical lead, or technical product owner:
the one person who can be relied
upon to make the final decision on technical matters.
We may not always agree with their decisions, but we respect the
decisions. Because they are part of the team they share our lived
experiences and should be taking them into account.</p>
<p>Should we have any other 'special' colleagues apart from these two
roles? Perhaps not. We take away shared responsibility every time we
give a specific person a specific title. For example, people feel a bit
less responsible for sticking to the product vision if they are not the
product owner.</p>
<h2 id="downsides">Downsides</h2>
<p>There can be downsides to a flat team too.
In thinking about software teams and product teams we tend to forget
that these are made out of people. The people in the team have a big
impact on the success of the team's mission.
People also need tending to. In a hierarchical team this responsibility
would fall to the supervisory manager.
In a flat hierarchy we can get into uncomfortable situations if teams
don't have a specific person to discuss people problems with. There is
nobody to go to for an outside perspective if there seems to be a
problem in the team. There is nobody who is in a position to make a
decision.
If such a person is part of the team they might become biased.
If they are outside of the team they won't know what is going on.</p>
<p>This principle can also apply to other topics, such as budgets and spending.</p>
<h2 id="outside-influences">Outside influences</h2>
<p>The average business consists of more parts than just the one flat
team. There are sales teams, account management teams, customer support teams,
and others. A common outside influence on a software development team is when
the sales team or account management team makes a promise to a customer that
is up to the software development team to fulfill.
It is not acceptable if someone else can make a decision about what the
team can work on.
The team is not a self-managing team if someone else can make a decision
without involving the team.
In that case the team is at best just a normal team. It doesn't have the power
to say yes or no. At worst, the team can become stuck between conflicting
decisions by others. This will reduce the team's impact and quality of work.</p>
<p>There is nobody to protect the team because there is no officially appointed
team manager. The self-managing team must protect itself and clarify
that any commitment that involves the team must be pre-approved by the team.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Flat teams are great. Generally people feel their shared responsibility and
really come together to solve problems.
However, even in a flat team there should be someone that is authorized to
make decisions on specific topics. And the topic 'people' should not be forgotten.</p>Git: how can we squash (flatten) commits2020-09-27T00:00:00+02:002020-09-27T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-09-27:/articles/2020-09-27-git-how-can-we-squash-flatten-commits.html<p>In this article I explain two ways in which we can squash commits.</p><p><a href="https://tacosteemers.com/articles/2020-07-10-when-(not)-to-squash-commits.html">Earlier I wrote about when (not) to squash commits</a>. To squash commits means to flatten, or combine, a series of consecutive commits in to one commit.
Here I explain two ways we can combine commits.</p>
<h1 id="squashing-consecutive-commits-on-your-feature-branch">Squashing consecutive commits on your feature branch</h1>
<p>The first way is interactive rebasing.
With this tool we can choose to combine several commits that only exist on this branch into one commit.
In my opinion, interactive rebasing is only a good idea in these two situations:</p>
<ul>
<li>you rebase right before merging the feature branch into the main branch, after this branch has been reviewed,
and you know no-one has branched off of this branch</li>
<li>you know you are the only one working on this branch, and as a result there is no possibility that you will rebase
commits that someone else is basing their own commits on</li>
</ul>
<p>The basic example is as follows. We have finalized part of our work on this feature branch. This part is spread out over
5 commits. We would like to flatten this into one commit for this part of the work, rather than keeping the 5 separate
intermediate commits. This can be done with the following command:</p>
<div class="highlight"><pre><span></span><code><span class="err">git rebase -i HEAD~5</span>
</code></pre></div>
<p>Note that the sign before the 5 is the tilde, ~, not the minus sign.
This command will fail if the branch's history contains only 5 commits in total,
because <code>HEAD~5</code> would then not exist.</p>
<p>We will be presented with the text editor. It shows a list of commits, ordered from the oldest on top to the newest at
the bottom. Here we will indicate which commits we want to <code>pick</code> (use as a basis to flatten on) and which commits we
want to <code>squash</code> away. The text editor window will contain an explanation of all possible commands. </p>
<p>We will pick the first and squash the rest.
We don't need to type out the full <code>squash</code>.
Just replace <code>pick</code> with <code>s</code>.</p>
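<p>For example, when flattening 5 commits the editor buffer might look like this after editing (the hashes and messages are illustrative):</p>

```text
pick a1b2c3d Add validation for the input form
s    e4f5a6b Fix off-by-one error in validation
s    c7d8e9f Address review comments
s    0f1e2d3 Fix typo
s    9a8b7c6 Update changelog
```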
<p>After we save the file and exit the editor we will receive instructions on how to proceed.</p>
<h1 id="how-does-this-look-in-a-bigger-process">How does this look in a bigger process?</h1>
<p>Let's look at the situation where we are manually merging a feature branch into
the main branch. Our feature branch has received 5 commits since it was branched off of the main branch. These commits
have passed the review phase. Now we want to combine these commits into one commit and add that to the main branch.
This one commit will be easier to backport than several commits, and easier to understand for future review than 5
separate commits would be.</p>
<p>First we update the main branch:</p>
<div class="highlight"><pre><span></span><code><span class="err">git checkout main</span>
<span class="err">git pull</span>
</code></pre></div>
<p>Then we squash our 5 commits on the feature branch:</p>
<div class="highlight"><pre><span></span><code><span class="err">git checkout feature</span>
<span class="err">git rebase -i HEAD~5</span>
</code></pre></div>
<p>The next step is where we rebase our feature branch with the current state of the main branch. In other words, we put the current state of the main branch below the additional commits that we added to the feature branch.
It is possible that there are conflicts that we need to resolve by hand.
We don't commit the changes that follow from the conflicts. We only stage the changed files.</p>
<div class="highlight"><pre><span></span><code><span class="err">git rebase main</span>
</code></pre></div>
<p>If we are resolving conflicts, we can get a hint on how to proceed from git, by asking for the status.</p>
<div class="highlight"><pre><span></span><code><span class="err">git status</span>
</code></pre></div>
<p>When we are done resolving conflicts, we finalize the rebase:</p>
<div class="highlight"><pre><span></span><code><span class="err">git rebase --continue</span>
</code></pre></div>
<p>If we feel that something is going wrong, we can abort the rebase:</p>
<div class="highlight"><pre><span></span><code><span class="err">git rebase --abort</span>
</code></pre></div>
<p>After finalizing the rebase we should perform our testing again.
Once we are satisfied that everything is as it should be,
we can merge the feature branch to the main branch.</p>
<h1 id="squash-everything-while-merging-to-the-main-branch">Squash everything, while merging to the main branch</h1>
<p>A second way we can combine commits is the actual <code>git merge --squash</code> command.</p>
<p>For discussion of this workflow, see: <a href="https://stackoverflow.com/questions/5308816/how-to-use-git-merge-squash">https://stackoverflow.com/questions/5308816/how-to-use-git-merge-squash</a></p>
<p>Personally I don't use this workflow by hand, as it is automated away where I work.</p>
<p>First we update the main branch.</p>
<div class="highlight"><pre><span></span><code><span class="err">git checkout main</span>
<span class="err">git pull</span>
</code></pre></div>
<p>Then we rebase the feature branch with the current state of the main branch, and resolve conflicts by hand.</p>
<div class="highlight"><pre><span></span><code><span class="err">git checkout feature</span>
<span class="err">git pull</span>
<span class="err">git rebase main</span>
</code></pre></div>
<p>After we finish rebasing and resolving conflicts we should perform our testing again. </p>
<p>The next step is to check out the main branch and merge --squash the feature branch onto it:</p>
<div class="highlight"><pre><span></span><code><span class="err">git checkout main</span>
<span class="err">git merge --squash feature</span>
</code></pre></div>
<p>Finally, the changes need to be staged and committed with <code>git commit</code>.
You might want to write a new message for this, using <code>git commit -m</code>.</p>The Twitter share button does not need javascript2020-08-30T00:00:00+02:002020-08-30T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-08-30:/articles/2020-08-30-twitter-share-button-does-not-need-javascript.html<p>When we let Twitter generate a 'tweet this button' for our website,
Twitter includes a javascript file.
We don't need to include this javascript file.
The share button can be just a hyperlink.
Twitter uses the referrer header to determine which URL the user wants to tweet about.</p>
<p>The javascript …</p><p>When we let Twitter generate a 'tweet this button' for our website,
Twitter includes a javascript file.
We don't need to include this javascript file.
The share button can be just a hyperlink.
Twitter uses the referrer header to determine which URL the user wants to tweet about.</p>
<p>The javascript does not benefit the reader. It might include tracking features that the reader does not like.</p>
<p>Twitter generates the following HTML:</p>
<div class="highlight"><pre><span></span><code><span class="nt"><a</span> <span class="na">href=</span><span class="s">"https://twitter.com/share"</span>
<span class="na">class=</span><span class="s">"twitter-share-button"</span>
<span class="na">data-via=</span><span class="s">"</span><span class="cp">{{</span><span class="nv">TWITTER_USERNAME</span><span class="cp">}}</span><span class="s">"</span>
<span class="na">data-text=</span><span class="s">"</span><span class="cp">{{</span> <span class="nv">article.title</span> <span class="cp">}}</span><span class="s">"</span>
<span class="na">data-dnt=</span><span class="s">"true"</span> <span class="na">data-show-count=</span><span class="s">"false"</span>
<span class="na">data-count=</span><span class="s">"horizontal"</span><span class="nt">></span>Tweet about this article<span class="nt"></a></span>
<span class="nt"><script</span> <span class="err">async</span> <span class="na">src=</span><span class="s">"https://platform.twitter.com/widgets.js"</span> <span class="na">charset=</span><span class="s">"utf-8"</span><span class="nt">></script></span>
</code></pre></div>
<p>We only need the a-tag.
The Twitter-specific anchor attributes become unused after removing the javascript.
We can simplify the anchor to the following:</p>
<div class="highlight"><pre><span></span><code><span class="nt"><a</span> <span class="na">href=</span><span class="s">"https://twitter.com/share?url=url_of_this_page"</span><span class="nt">></span>Tweet about this article<span class="nt"></a></span>
</code></pre></div>
<p>In Pelican, the URL would be built up as follows:</p>
<div class="highlight"><pre><span></span><code><span class="x">https://twitter.com/share?url=</span><span class="cp">{{</span> <span class="nv">SITEURL</span> <span class="cp">}}</span><span class="x">/</span><span class="cp">{{</span> <span class="nv">output_file</span> <span class="cp">}}</span><span class="x"></span>
</code></pre></div>
<p>If we want to have an icon then we can do that with an image.
To get the image file you can copy one from Twitter, or any other share button such as the one on this website.</p>
<div class="highlight"><pre><span></span><code><span class="nt"><div></span>
<span class="nt"><img</span> <span class="na">class=</span><span class="s">"icon"</span> <span class="na">src=</span><span class="s">"/</span><span class="cp">{{</span> <span class="nv">THEME_STATIC_DIR</span> <span class="cp">}}</span><span class="s">/images/twitter.png"</span> <span class="na">alt=</span><span class="s">"Twitter"</span><span class="nt">/></span>
<span class="nt"><a</span> <span class="na">href=</span><span class="s">"https://twitter.com/share"</span><span class="nt">></span>Tweet about this article<span class="nt"></a></span>
<span class="nt"></div></span>
</code></pre></div>
<p>The image would preferably be placed on a location that is yours. The src attribute needs to refer to the exact location of the image.</p>Git pull without merge2020-08-18T00:00:00+02:002020-08-18T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-08-18:/articles/2020-08-18-git-pull-without-merge.html<p>Applying remote changes to our local branch without an additional merge commit</p><h1 id="the-problem">The problem</h1>
<p>I want to update my local git branch based on the changes that are on the remote branch.
If I just do <code>git pull</code>, I will need to do a merge. I don't want to get an additional commit just for merging.
Instead, I want my local changes to be applied on top of the changes that have been made to the remote branch.</p>
<h1 id="a-solution">A solution</h1>
<p><code>git pull --rebase</code></p>
<p>There will be no additional commit for merging.
Instead, your local commits will be replayed on top of the changes fetched from the remote branch.
Now it looks like the branch has always been based on the current state of the remote branch.</p>
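<p>Conceptually, the difference looks like this (the commit letters are illustrative):</p>

```text
Before pulling:     remote:  A---B---C
                    local:   A---B---D        (D is our local commit)

git pull:           local:   A---B---C---M    (M is an extra merge commit,
                                  \--D--/      with parents C and D)

git pull --rebase:  local:   A---B---C---D'   (D replayed on top of C)
```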
<h1 id="background">Background</h1>
<p>This is what we call <em>rebasing</em>. One might say that the state we <em>base</em> our changes on is <em>re</em>-done.</p>
<p>There is also interactive rebasing, but that is a different topic.</p>
<p>We should not do <code>git pull --rebase</code> on the same main branch that other people are pushing work in progress to as well.
That will get messy really fast.
One proper way to handle that type of situation is to use a <a href="https://www.atlassian.com/git/tutorials/comparing-workflows/feature-branch-workflow">feature branch workflow</a>. </p>When (not) to squash commits2020-07-10T00:00:00+02:002020-07-10T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-07-10:/articles/2020-07-10-when-(not)-to-squash-commits.html<p>In this article I want to talk about squashing (flattening) a series of commits in to one commit. Squashing is a good tool to have, but not everything should be squashed away. Intermediate commit history can have value.</p><p><em>Summary: Squashing is a good tool to have, but not everything should be squashed away. Intermediate commit history can have value. Please don't squash rework on a PR during the review phase. If you do, the reviewers may have to start over from scratch.</em></p>
<p>In this article I want to talk about when it is appropriate to <a href="https://tacosteemers.com/articles/2020-09-27-git-how-can-we-squash-flatten-commits.html">squash (flatten) a series of commits into one commit</a>. I assume that we are using a branch for each issue (such as a bug or a feature), and we merge that branch to the main branch when it is considered finished, tested and properly reviewed.</p>
<h2 id="why-do-we-squash-during-development">Why do we squash during development?</h2>
<p>In my experience it is good practice to commit often. It allows us to easily revisit or roll back any specific step that we take towards the final version of our code. This makes the process of finding the desired solution easier. When it comes time to share our work, we might want to squash our commits. </p>
<p>After we decide that we have found a sufficient solution for a specific problem, the road we took may not seem interesting anymore. At that point we are only interested in having solved the issue at hand. We might then squash a series of commits that we made while working towards this solution, and move on to the next problem.</p>
<p>There is another reason one might squash their commits. When we are writing code we make that code permanent by committing the code locally and eventually pushing it out to a central server, after which our code will be available to everyone. Sometimes we feel kind of bad about our earlier commits and don't want to make them permanently visible to others. Perhaps they were failed attempts and would show to others how little we understood when we started to work on this issue. This is also a valid reason to squash, and I don't want to minimize these kinds of feelings.</p>
<p>Whatever the reason, we decide to remove the earlier commits from existence such that only the final result remains in one neat commit.</p>
<h2 id="squashing-is-a-good-tool-to-have">Squashing is a good tool to have</h2>
<p>Having a series of commits squashed into one can be very handy when that commit eventually ends up on the main branch; with each issue contained in one commit it is a lot easier to research changes than when changes related to single issues are spread out over ten or a hundred commits. For that reason, I always squash before I merge an approved pull request.</p>
<p>Another reason is that it makes <a href="https://git-scm.com/docs/git-cherry-pick">cherry-picking</a> less error-prone and less time-consuming. </p>
<p>Squashing is great.
However, there are also situations where I think we should not squash commits.</p>
<h2 id="not-everything-should-be-squashed-away">Not everything should be squashed away</h2>
<p>Some pull requests contain a large number of changes done in several steps that are clearly separate topics in the minds of the developers. In these cases it can be beneficial to do these steps one by one and squash the commits for one step into one commit before proceeding to the next step. This will make the end result easier to review because it is still separated into understandable steps. It also makes it easier to get review comments on intermediate work; this can be important if the requested changes have impact on the next steps.</p>
<p>A situation where one should not squash at all is when one is committing rework in response to changes requested by reviewers. The reviewers are not as focused on your work as you are. They did the review and went on to another activity. By the time they get back to your pull request to see what has changed since the last time they looked at it, the reviewers will not remember the exact state of your branch before you did rework and squashed the branch again. The reviewers probably can't be certain if your rework has even improved the branch. Your reviewers will have to redo the entire review as a result of you squashing the rework on top of your original work. By the time they are half-way through re-reviewing your work they might be quite tired and have difficulty keeping focused on giving the pull-request a high-quality review.</p>
<p>We can save our colleagues as well as ourselves some time by taking a moment to consider before we squash.
Our reviewers will be grateful!</p>Please use root relative URLs2020-06-21T00:00:00+02:002020-06-21T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-06-21:/articles/2020-06-21-please-use-root-relative-urls.html<p>Resources with URLs that do not specify the root will not be loaded if the user visits a subpage directly.</p><p>Recently I fixed a bug in one of my employer's web applications, that
would only sometimes occur, and only when using Internet Explorer 11.
It had to do with application behavior that requires JavaScript.</p>
<p>Our mind may race to some obscure JavaScript Internet Explorer
facts like <a href="https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#Browser_compatibility">the missing URL object constructor</a>
or that console.log() will fail if the console is not open.
The actual cause was much more mundane.</p>
<p>The URL to the JavaScript relating to this feature was not
relative to the root. It was missing the initial forward slash, <code>/</code>.
As a result it refers to a file relative to the directory of the page
the reference occurs in.
Resources with URLs that do not specify the root will not be loaded if the user visits a subpage directly.
An easy mistake to make, but less easy to discover due to the
modern web application and browser landscape.</p>
<p>Some modern web applications consist of a single page. This type
of application is referred to as a <a href="https://en.wikipedia.org/wiki/Single-page_application">Single Page Application, SPA</a>.
All JavaScript is loaded through that one page.
Other applications may have a single page where the user lands, the main page of the application.
In this situation the main page is likely to load most or all of the JavaScript for the application.
In both cases, the JavaScript will load correctly even if the <code>/</code> has been omitted,
as the URL for the page will be equal to the web application root.</p>
<p>If we were to use a nested URL directly, without visiting the main page, it would go wrong.
The reference to the JavaScript file will be parsed as relative to the nested URL rather than to the root.</p>
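<p>The browser's resolution rules can be sketched with Python's <code>urllib.parse.urljoin</code>, which follows the same standard (the page and file names are illustrative):</p>

```python
from urllib.parse import urljoin

# On the application's root page both forms resolve to the same file,
# which is why the omitted slash goes unnoticed...
assert urljoin("https://example.com/", "js/app.js") == "https://example.com/js/app.js"
assert urljoin("https://example.com/", "/js/app.js") == "https://example.com/js/app.js"

# ...but on a nested page, only the root relative form still
# points at the intended file.
print(urljoin("https://example.com/reports/2020/", "js/app.js"))
# https://example.com/reports/2020/js/app.js
print(urljoin("https://example.com/reports/2020/", "/js/app.js"))
# https://example.com/js/app.js
```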
<p>In a stand-alone web application this may never be a problem.
However, many modern applications integrate with other applications.
Software vendors may provide one main application and several other linked applications for added value.
Even a small business is likely to use a series of interconnected tools.</p>
<p>Please always use root relative URLs to avoid this type of issue.
To make this easier we can stop putting slashes at the start and end of variables,
and always put them between the variable names.
This practice also makes the code easier to read because we never have to ask ourselves if
the slashes are correct; we see them immediately.</p>
<p>The surprising part is that the failure situation does not occur in other browsers because they retrieve the correct
file from the cache even though technically a wrong file is referenced.
In my opinion Internet Explorer 11 shows the correct behavior.
The other browsers make it impossible to reliably use the same JS file name in nested directories.</p>
<p>Due to the fact that Microsoft <a href="https://docs.microsoft.com/en-us/lifecycle/faq/internet-explorer-microsoft-edge">product support dates are linked to Operating System support dates</a>, Internet Explorer 11 will be supported
<a href="https://support.microsoft.com/en-us/lifecycle/search?alpha=windows%2010">until September 2029 in some situations</a>.
An example is the Windows 10 Enterprise 2019 Long Term Servicing Channel.</p>Creating a new website without breaking search engine results and bookmarks2020-06-14T00:00:00+02:002020-06-14T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-06-14:/articles/2020-06-14-creating-a-new-website-without-breaking-search-engine-results-and-bookmarks.html<p>When moving our website to a different CMS or framework we might end up with a different URL structure. This is bad for search engine rankings and people's bookmarks. To avoid problems we can use rewrite rules and redirect rules.</p><p>When moving our website to a different CMS or framework we might end up with a different <a href="https://en.wikipedia.org/wiki/URL">URL</a> (web address) structure.
This is bad for search engine rankings and people's bookmarks.
To avoid problems we can use rewrite rules and redirect rules.</p>
<p>An acquaintance asked for help with a <a href="https://en.wikipedia.org/wiki/URL_redirection">redirect rule</a> on LinkedIn.
They explained that their new website design used different software
that placed their blog articles in a different directory, resulting in a different URL.
The change in URLs was not acceptable to them because they did not want to lose the SEO profile (search engine rankings)
for their website.
If a search engine cannot find your pages anymore, your pages lose their ranking in the search results.
They thought they needed a redirect rule to redirect the search engine bots from the old URL pattern,
<code>/YYYY/MM/slug</code>, to the new pattern, <code>/posts/YYYY/MM/slug</code>.
On <a href="https://www.netlify.com/">Netlify</a>, their previous host, they could add <code>/:year/:month/:slug /posts/:year/:month/:slug</code>
to their <a href="https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file">_redirects-file</a>
to achieve the redirect effect.
Their new host uses the Apache web server, a very common choice.
They wondered how they could achieve the same effect with Apache.
The Apache documentation has <a href="https://httpd.apache.org/docs/2.4/rewrite/remapping.html">a great page on this topic</a>.</p>
<img alt="Sequence diagram for the redirect model" class="uml" src="/images/001f6f66.png"><p>It is clear what they want to achieve, and it can be done with <a href="https://httpd.apache.org/docs/2.4/mod/mod_alias.html#redirectmatch">the RedirectMatch directive</a>, but they do not need to use a redirect.
A redirect requires clients and search engine bots to actually visit the old address before they are redirected to
the new address. That takes a small amount of time.
Humans prefer low load times.
Depending on the contents and architecture of your site,
a single page on your site might load twenty resources such as fonts, styling
documents, javascript, images, pieces of advertising copy, contact details and finally the actual content.
In the end all small load times add up to a bigger, more noticeable load time.
Redirects may also be applied on top of each other.
Mistakes can lead to wrong or infinite redirects.
This problem might suddenly occur when you want to take the website down for maintenance and redirect all URLs
to an "under maintenance" page.</p>
<p>An alternative solution is possible with <a href="https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule">a RewriteRule directive</a>.
This solution does not use redirects towards the clients but instead tells Apache how to fulfill the request, internally.
The following example keeps the URLs the same, regardless of the change in directory structure:</p>
<div class="highlight"><pre><span></span><code><span class="err">RewriteEngine On </span>
<span class="err">RewriteRule ^([0-9]+)/([0-9]+)/(.+)$ posts/$1/$2/$3 [NC,L]</span>
</code></pre></div>
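<p>As a quick offline check, the pattern can be exercised with Python's <code>re</code> module (a sketch, not Apache itself; <code>re.IGNORECASE</code> plays the role of the <code>NC</code> flag, and <code>\1</code>..<code>\3</code> correspond to Apache's <code>$1</code>..<code>$3</code>):</p>

```python
import re

# The pattern from the RewriteRule above; NC becomes re.IGNORECASE.
rule = re.compile(r"^([0-9]+)/([0-9]+)/(.+)$", re.IGNORECASE)

def rewrite(path: str) -> str:
    # Apache's $1..$3 are written \1..\3 in re.sub.
    return rule.sub(r"posts/\1/\2/\3", path)

print(rewrite("1999/01/slug"))  # posts/1999/01/slug
print(rewrite("about"))         # about  (no match, path unchanged)

# The stricter variant suggested in the text: four digits for the
# year and two for the month.
strict = re.compile(r"^([0-9]{4})/([0-9]{2})/(.+)$")
assert strict.match("1999/01/slug") is not None
assert strict.match("99/1/slug") is None
```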
<img alt="Sequence diagram for the rewrite model" class="uml" src="/images/ec89415b.png"><p>As a result, visiting <code>/1999/01/slug</code> gives us the resource on the server-side path <code>/posts/1999/01/slug</code>.
A RedirectMatch directive would look similar.
Each pair of <code>()</code> captures a numbered match that is loaded into a corresponding <code>$</code> variable.
The first match becomes variable <code>$1</code>, and so forth.
This rule will be applied to everything that looks like <code>numbers/numbers/any_character</code>.
It can be improved by indicating that the first match should have four numbers (for the year), and the second match
should have two numbers (for the month). The rule as written only specifies <code>+</code>, 'one or more'.
The final square brackets give two more instructions.
NC indicates that the rule is not case sensitive. L indicates that if this rule matches the request it should be the
last to be applied. In other words, this finalizes the rewriting portion of the request handling process.</p>Notes on making drag-and-drop functionality with Javascript2020-06-06T00:00:00+02:002020-06-06T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-06-06:/articles/2020-06-06-notes-on-making-drag-and-drop-functionality-with-javascript.html<p>When I wanted to make drag and drop functionality I found there were plenty of tutorials out there. I am adding a few more words on the topic because I had to find workarounds for issues that I didn't find mentioned elsewhere.</p><p>When I wanted to make drag and drop functionality I found there were plenty of tutorials out there.
<a href="https://developer.mozilla.org/en-US/docs/Web/API/HTML_Drag_and_Drop_API">This is the page I used myself</a>.</p>
<p>I am adding a few more words on the topic because I had to find
workarounds for issues that I didn't find mentioned elsewhere.</p>
<p>Drag and drop has four visual elements:</p>
<ul>
<li>The items that can be dragged</li>
<li>The area they originally came from</li>
<li>The areas they can be dropped on</li>
<li>The area an item actually gets dropped on when the drag action ends</li>
</ul>
<p>We will call the items that can be dragged the <em>draggables</em>.
The areas that a draggable can be dropped on will be called the
<em>receivers</em>.</p>
<h3 id="topics">Topics</h3>
<p>In this article I want to discuss three topics.
First, for Safari the receivers need an extra attribute
to be able to receive events that indicate the mouse cursor has left
the area while dragging.
Second, in all browsers the receivers need an extra
attribute to be able to properly handle situations where the mouse
cursor stays inside a receiver, but while doing
so has also moved over an item inside the receiver that is itself
not a receiver.
The final topic is draggable elements that contain a textarea, and
making the textareas function as expected in the Edge and Firefox
web browsers.</p>
<h3 id="event-dragexit-does-not-work-as-expected-on-safari">Event dragexit does not work as expected on Safari</h3>
<p>I gave a different background color to receivers while the user
was dragging a draggable over them.
By using a different background color the user can see which areas
are receivers. We don't want to let the user find out by
trial and error; dropping the draggable somewhere that cannot receive
it will release it from the mouse cursor.
The user would have to find the draggable again,
and hope the next area would accept it.
Adding the different background color when the mouse cursor enters
the receiver can be done with an event handler added to the ondragenter
attribute.
My addition comes in when we want to remove the background color
after the mouse cursor has passed over the receiver.
On Firefox it is sufficient to add the event handler to the ondragexit
attribute. For Safari support, and possibly other browsers, we need
to add the same event handler to the ondragleave attribute as well.</p>
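<p>A minimal sketch of that wiring (the highlight color is illustrative, not from my actual pages):</p>

```javascript
// Sketch: attach highlight handlers to one receiver element.
function attachReceiverHighlight(receiver) {
  const highlight = (on) => {
    receiver.style.backgroundColor = on ? "lightyellow" : "";
  };
  // preventDefault() in dragenter is needed to mark the element
  // as a valid drop target.
  receiver.ondragenter = (e) => { e.preventDefault(); highlight(true); };
  // Firefox fires dragexit when the cursor leaves; Safari only
  // fires dragleave, so we register the same handler on both.
  receiver.ondragexit = () => highlight(false);
  receiver.ondragleave = () => highlight(false);
}
```

<p>In the browser this would be called once per receiver, for example with <code>attachReceiverHighlight(document.querySelector('.receiver'))</code>.</p>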
<h3 id="event-dragenter-has-quirky-behaviour">Event dragenter has quirky behaviour</h3>
<p>With the previous improvement we have added a new problem.
In all browsers we lose the background color on the receiver if we
move the mouse cursor over an unrelated item inside the receiver,
even though the mouse cursor has not left the receiver.
This is as expected; the item the mouse cursor is now positioned above
is not the receiver itself. As a result, the ondragexit or ondragleave
event handler will be called.</p>
<p>The problem is that the background color is not added back to the
receiver when the mouse cursor is positioned above the receiver again.
This can be solved by adding the color-change code to both the
dragenter event handler and the dragover event handler.</p>
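<p>A sketch of that pairing (the color is illustrative):</p>

```javascript
// Sketch: both dragenter and dragover (re)apply the highlight, so the
// color comes back after the cursor crosses a child element that is
// not itself a receiver.
function keepReceiverHighlighted(receiver) {
  const show = (e) => {
    e.preventDefault(); // also required to allow dropping here
    receiver.style.backgroundColor = "lightyellow";
  };
  receiver.ondragenter = show;
  receiver.ondragover = show; // fires repeatedly while hovering
}
```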
<h3 id="text-selection-in-a-textearea-inside-a-draggable-element">Text selection in a textearea inside a draggable element</h3>
<p>The final problem we need to solve is the text selection
behavior in a textarea that is placed inside a draggable.
In Edge and Firefox the mouse-based text selection does not work.
It does work in Safari.
I have not been able to solve this yet.
Other applications I am aware of avoid this problem; when the user
clicks the draggable element they open a modal screen where the
user can edit the contents of the draggable element.
I think this work-around also gives the user a better experience
because they will always know which draggable they are typing in. </p>Why can the data we just stored on disk not be read?2020-05-31T00:00:00+02:002020-05-31T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-05-31:/articles/2020-05-31-why-can-the-data-we-just-stored-on-disk-not-be-read.html<p>Data that our code writes to a file may not be accessible immediately after writing.
One reason is that some output stream implementations use buffers,
and prefer to write out their whole buffer in one go.
In these situations, if we want to be sure the data can be read …</p><p>Data that our code writes to a file may not be accessible immediately after writing.
One reason is that some output stream implementations use buffers,
and prefer to write out their whole buffer in one go.
In these situations, if we want to be sure the data can be read from the file right away, we flush the buffer.
Another option in Java is to open the file for writing with the <a href="https://docs.oracle.com/javase/10/docs/api/java/nio/file/StandardOpenOption.html#SYNC">StandardOpenOption.SYNC</a> parameter.
<pre>
SYNC
Requires that every update to the file's content or metadata be written synchronously to the underlying storage device.
</pre>
Note that this sounds like it could lead to performance issues, as opposed to a single sync
when finished writing for the current task.</p>
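<p>Opening a file that way might look like the following sketch (the class name, path and payload are illustrative):</p>

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncWriteExample {
    // Opens the file so that every update is written synchronously to the
    // underlying storage device, then writes the payload.
    static void writeSynced(Path path, String data) throws IOException {
        try (OutputStream out = Files.newOutputStream(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.SYNC)) {
            out.write(data.getBytes(StandardCharsets.UTF_8));
        } // close() also flushes any remaining user-space buffering
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sync-demo", ".txt");
        writeSynced(tmp, "hello");
        Files.delete(tmp);
    }
}
```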
<h2 id="the-problem">The problem</h2>
<p>While investigating a flaky end-to-end test at work,
I have found that the data may not be available after writing, even if we have flushed the buffer,
even if the output stream implementation does not buffer,
or even if we used the SYNC option when opening the file for writing.</p>
<p>It is true that the SYNC option <a href="https://docs.oracle.com/javase/10/docs/api/java/nio/file/package-summary.html#integrity">comes with some cautions</a>.
As far as I can tell we are fulfilling the requirements needed to get the expected behaviour.</p>
<h2 id="how-could-the-data-not-be-on-disk">How could the data not be on disk?</h2>
<p>Several possible reasons occurred to me.
Perhaps the runtime implementation does not honor the SYNC instruction. This seems unlikely because
our error situation was occurring on an Oracle JVM implementation.
I don't expect a bug in file writing code there. </p>
<p>Neither do I expect a bug in <a href="https://en.wikipedia.org/wiki/Page_cache">the operating system disk cache</a>.
Please check out <a href="https://upload.wikimedia.org/wikipedia/commons/3/30/IO_stack_of_the_Linux_kernel.svg">the diagram</a>
listed on that page! It is incredible how many components our bytes might flow through. </p>
<p>Perhaps some other buffer is getting in between me and my data.
SSDs and hard disks have <a href="https://en.wikipedia.org/wiki/Disk_buffer">a disk buffer</a>, a small amount of RAM-like memory on the drive itself.
The disk buffer is read first, before the drive looks any further.
That is the idea behind disk buffers: providing fast storage for recent or popular data.
<a href="https://drivescale.com/2017/03/whatever-happened-durability/">This article claims</a> that they have seen write
operations on SSDs take as long as 6 seconds, so the hardware write time may still be an interesting angle
if there is no disk buffer, or there is a disk buffer bypass for some reason.</p>
<p>A filesystem driver could do additional buffering.
I suppose it is possible that the filesystem driver does not honor the SYNC instruction.</p>
<p>Certainly a remote filesystem cannot be expected to honor that; it would require a lot more work to do so
across computers and I would expect quite low performance, especially if the protocol is built on <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">a reliable
protocol such as TCP</a>.
The guarantees such reliable protocols give require a lot more network traffic than is required for
transferring the data itself. I imagine that when very small byte chunks have to be transferred and
immediately written to disk it will require more resources because it requires more traffic. It might also
incur more delays because the bytes have to be written out in the correct order. </p>
<p>I see reliability in the face of hardware or software failure as the main benefit of direct data
synchronization. The extra network traffic required to make a protocol reliable makes it less likely that
this benefit of direct synchronization can be achieved. Several network packets may stay buffered on the
network adapter because one packet is out of order. The missing packet may never arrive because of a failure in
any of the involved systems. In that case the data inside buffered packets would not be written to disk.
Note that the same is true with disk buffers; most SSDs don't have enough power stored to write their disk
buffer to disk when power is lost. That kind of feature is called 'power loss protection'.</p>
<h2 id="other-ways-in-java">Other ways, in Java</h2>
<p>Is there another way in Java to ask for the file contents to be written to the file immediately?
The FileDescriptor class <a href="https://docs.oracle.com/javase/10/docs/api/java/io/FileDescriptor.html#sync()">has a sync method</a>.
When opening the file with the SYNC option we use the java.nio.file.Files class and receive an output stream.
The FileDescriptor class is in the java.io package.
These two avenues cannot be combined.
<a href="https://docs.oracle.com/javase/10/docs/api/java/nio/channels/FileChannel.html#force(boolean)">FileChannel's force method</a> won't help us as we need to pass an output stream to an external library.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Our situation occurred in an end-to-end test.
It is not clear to me how our specific situation can be solved without adding a wait period in between.
This situation, where we write to disk and immediately want to read the same data from disk, should not occur
in production code.
It requires less disk I/O to keep the data in memory after writing it to disk, instead of reading it from
disk directly after writing.
That means the performance as a whole should be considerably better because RAM is considerably faster to
access than disks.
This problem should only occur in practice if you can change neither the code that writes to disk nor the code that
reads from disk.
In our production environment our testing problem is not relevant.
Still, I wonder why the data could not be read from disk even though we had asked for it to be written out
immediately.</p>How to avoid displaying directory listings on your website2020-05-23T00:00:00+02:002020-05-23T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-05-23:/articles/2020-05-23-how-to-avoid-displaying-listings-on-your-website.html<p>Our websites contain directories with files that are not usually read by humans. Examples are directories containing Javascript files or files for XML feeds. Sometimes we want to disallow directory listings for these directory contents.</p><p>Our websites contain directories with files that are not usually read by humans.
Examples are directories containing Javascript files or files for XML feeds.
Sometimes we want to disallow directory listings for these directory contents.
Here is an example of a directory listing for my blog articles:
<img alt="Screenshot of a directory listing" src="/files-posts/images/directory_listing_example.png" title="Fig. 1: Screenshot of a directory listing"></p>
<p>Normal users of our site do not visit these directory listings.
To reach them requires adjusting the address bar by hand.</p>
<h2 id="the-problems">The problems</h2>
<p>The files in these directories are unlikely to be useful to visitors.
If they are, the visitor should be guided towards them
through your navigation structure. Then they will have the proper context to interpret these files. </p>
<p>The RSS and Atom feed files for my articles are available on my main page as well as the article category pages.
<span id="footnote-1-return">There are also some automatically generated feed files that I don't list on my website because I don't think they will
benefit people as much as the ones that I do list <a href="#footnote-1">(1)☟</a>.</span></p>
<p>A potential issue is that the files, when loaded directly from a
directory listing, may not have the header and footer they would have
when they are loaded the usual way. You may have navigation elements in the header and terms and service elements in
the footer. A document that is accessed directly through a directory listing may not contain these elements.</p>
<p>A problem with the example in the screenshot is that the links in that automatically generated listing
do not work. The links lead the user to an error page.</p>
<p>For these reasons I think the directory listings are not beneficial to our visitors.</p>
<h2 id="the-solution">The solution</h2>
<p>This website uses a standard webhosting plan.
The web services that serve up the websites on these webhosting plans
usually support the <a href="https://en.wikipedia.org/wiki/.htaccess">hypertext access file</a>.</p>
<p>This .htaccess file can be used to hide the directory listings by using the following entry:</p>
<p><code>Options -Indexes</code></p>
<p>The attribute name is <code>Indexes</code> because a directory listing can also be called a directory index. The <code>-</code> means 'no'.
Here is <a href="https://httpd.apache.org/docs/current/mod/core.html#options">a manual on the Options entry</a>. </p>
<p>After adding this the following message appears on my site instead of the directory listings:</p>
<pre>
Forbidden
You don't have permission to access this resource.
</pre>
<p>The .htaccess file can be placed in the root directory of your website.
Your hosting provider may have instructions on their site as well.</p>
<p>If the .htaccess file is not supported in your situation you may want
to contact your hosting provider and ask them to disable directory listings for your site.</p>
<h2 id="an-alternative-solution">An alternative solution</h2>
<p>There is a manual workaround. It requires adding a file to each directory whose listing we want to hide.
To each directory that should not be listed we can add an index.html
file. Webserver software is usually configured in such a way that it will prefer sending this file to the
client instead of showing the directory listing. The file can be empty, show
a "file not found" message, or show your website navigation.</p>
<h4 id="footnotes">Footnotes</h4>
<p><span id="footnote-1">(1) It would be preferable these unused files would not be generated. I have not figured out how to stop these files from being generated. <a href="#footnote-1-return">☝</a></span></p>How to include HTML documents inside HTML documents2020-05-18T00:00:00+02:002020-05-18T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-05-18:/articles/2020-05-18-how-to-include-html-documents-inside-html-documents.html<p>There are two good ways to include our static HTML documents inside the main HTML documents that make up our website.
One is to include them server-side, before the server sends the main HTML document to the web browser.
The other is client-side loading where we have Javascript on the …</p><p>There are two good ways to include our static HTML documents inside the main HTML documents that make up our website.
One is to include them server-side, before the server sends the main HTML document to the web browser.
The other is client-side loading where we have Javascript on the client (the web browser) request information from a
server, format that as HTML and append that to the main HTML document inside the web browser.
Client-side loading is the best way to load dynamic content, such as a feed that receives updates.
Mozilla provides <a href="https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Fetching_data">a resource that explains why</a>
you would want to use this method to load dynamic data.
However, the topic of my article today is including static HTML documents, and for that purpose I personally find client-side loading
to be too involved. For example, it requires putting some thought in to error handling,
and the order in which information and styling becomes visible to people during the page loading phase.
To keep this article short I will skip client-side loading.
I do want to mention the <a href="https://developer.mozilla.org/en-US/docs/Glossary/Cross-site_scripting">security implications</a>
related to loading other people's documents into your pages.
To keep our users safe we have to assume that not all security problems can be solved by restrictions inside web
browsers, and keep an eye on the security implications of using documents and scripts that we did not make ourselves. </p>
<p>There is a third way that we will discuss first.
An iframe can be used to load an HTML document inside another document.</p>
<h2 id="iframe">iframe</h2>
<p>From talking to colleagues I know that they don't like iframes.
Perhaps they have <a href="https://en.wikipedia.org/wiki/Frame_(World_Wide_Web)#Criticism">one of the many criticisms listed here</a>.
One of the biggest downsides of iframes, also listed on that page, is that <a href="https://en.wikipedia.org/wiki/Screen_reader">screen readers</a>
have difficulty explaining them to the user.</p>
<p>In my opinion the iframe has one purpose that it serves well. It allows one to load an entirely separate page into
another page.
This can be handy for all sorts of uses such as manuals, blogs with comments, or separated functionality such as a
widget that contains a video or a comic panel.
In case of classic, book-inspired manuals the iframe can be used to create a navigation panel on one side of the page
and show the user the page or manual they asked for on the other side of the page.
Blogs with external commenting systems can use them to load the commenting functionality without letting the external
scripts have access to the other contents of the blog. For external video and picture files it has the same advantages.
If these contents from external documents were not restricted to the size of their iframe
they might be able to show a full-page advertisement. </p>
<p>Given all the criticisms of iframes we saw on that Wikipedia page earlier, we really should avoid using them.
At the time of writing I am unfortunately still using them in two locations on this website.</p>
<p>One is a page where I have some <a href="/pages/timers.html">easy to use timers</a>. I use them as a pomodoro timer and a break timer.
To avoid having to add the same HTML several times I load these timers with an iframe.
There is an important principle in computer programming: <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">don't repeat yourself</a>.
The timers need HTML tags, styling and javascript code to be able to function as intended.
If I were to paste several copies of everything they need on the same page I would be violating this DRY principle.
The result would be that a single mistake would have to be fixed in several places.
As I wrote about before, <a href="/articles/2020-05-12-mistakes-will-be-made.html">we don't want to set ourselves up for mistakes</a>.</p>
<p>Another place I use an iframe is a page where I embed <a href="/pages/verticalboard.html">a board that I use</a>
when I want to track the progress of several smaller tasks I have to accomplish during my work day.
My justification for using it here is that this board is <a href="https://gitlab.com/taco.steemers/kanbanboard/">a separate application maintained in a separate codebase</a>.
If I want to use an updated version on this website I copy the files over and overwrite the ones that were there.
The naive way to avoid using an iframe for this use case is copying the actual HTML in to the
<a href="https://daringfireball.net/projects/markdown/">Markdown document</a> that my
<a href="https://getpelican.com/">static blog generator</a> uses to generate the page that ends up on my website.
The reason we should avoid doing that is that it would again lead to repetition and an increased risk of mistakes.</p>
<h1 id="server-side-inclusion">Server-side inclusion</h1>
<p>I am going to stop using iframes for these use cases.
Instead, I am going to use the <a href="https://jinja.palletsprojects.com/en/2.11.x/templates/#include">include statement</a> of
the blog generator's template language to include these HTML documents in to the main HTML document on the server,
before the server sends the main document to the client.
I will remove the head and body tags from the timer document.
That way we can include the timer document without ending up with several head and body tags, as there should be
only one of those if the document is not inside an iframe.
Luckily documents without those tags <a href="https://stackoverflow.com/a/25749523">are considered valid since at least 2014</a>.
As a result of that we will still be able to use the vertical board and the timers as stand-alone tools even when
omitting these tags.</p>
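<p>With the template language, such a server-side include is a single statement in the page template. A sketch (the exact file name and path depend on the theme layout):</p>
<p><code>{% include "timer.html" %}</code></p>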
<p>The downside for the timers example is that the payload increases in size.
In the iframe situation the web browser would do
only one extra request to the server to get the embedded HTML document that contains the timer.
The web browser is smart enough to realise that it doesn't make sense to make another request for the other two timers;
the response would be the same.
In the new situation the three copies of the timer will actually have to be sent to the client in duplicate.
All three will be sent as part of the main HTML document that contains them.
In this case it is entirely acceptable because the timer HTML is small.</p>
<p>Currently, the timer Javascript and CSS are stored inside the timer HTML.
Apart from some laziness, a valid reason to keep it that way is that by not using separate files for the Javascript and
the CSS we avoid creating two extra HTTPS requests to the server.
Keeping the JS and CSS separate from the HTML is what one <a href="https://softwareengineering.stackexchange.com/a/271338">should usually do in more standard situations</a>.
When the timer documents are not in iframes anymore I will move the JS and CSS to separate files.
In this particular situation it will come at the cost of two more HTTPS requests to the server;
one for the JS file and one for the CSS file.
If anyone is still with me at this point: we will be making one more request than we did in the original situation.
At the scale of a website like this the extra load on the server will not be noticed.</p>
<h2 id="if-we-insist-on-using-an-iframe">If we insist on using an iframe</h2>
<p>If we insist on using an iframe, let's use them well!
Iframes have some cross-browser problems that I have run in to myself.</p>
<p>One is that the difference in screen size between a computer screen and a smartphone screen can be large.
We still need to get the document inside the iframe to be displayed properly on both screen sizes,
without pushing away the content on the main document.
The difference in screen sizes could in the past easily be accounted for with CSS
<a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Media_Queries">media queries</a>
in combination with device-related CSS properties like
<a href="https://developer.mozilla.org/en-US/docs/Web/CSS/@media/device-height">max-device-height</a>.
Unfortunately these device-related properties are deprecated.
The word 'deprecated' means that a feature's use is discouraged and that browser developers may drop support for it.
The browser on my phone doesn't support it any more.
That is how I found out that the property is deprecated even though testing on my computer
showed that it worked as expected. </p>
<p>In my case I want the contents to be shown correctly on different sizes of screens, but I also want to show a message that explains the vertical board application does not work on small screens.
To get a similar effect today, without using the device-related CSS properties, we have to take three steps.</p>
<ul>
<li>We use the regular <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/max-height">max-height css property</a>
in the media queries inside the iframe to hide the application on the small screen and show the message instead:
<pre>
@media only screen and (max-height: 800px) {
    #application {
        display: none;
    }
    #notmobilefriendly {
        display: block;
    }
}
</pre></li>
<li>We set width: 100% on the body inside the iframe:
<pre>
body {
    width: 100%;
}
</pre></li>
<li>On the iframe tag in the main document we add CSS to set a size based on how big the browser window is, like this:
<pre>
height: calc(100vh - 80px);
width: 100vw;
</pre></li>
</ul>
<p>Here I have subtracted the height of the header on the main document from the height of the window
(vh, which stands for view height) to get the correct height for the iframe.
The browser will now calculate the correct height and width for us.</p>
<p>Another problem is that the Safari web browser, used on Apple devices, does not handle iframe content size well.
We will skip the details and go right to the solution.</p>
<ul>
<li>We place the iframe inside a container that has some size of its own, a padded div for example:</li>
</ul>
<p><code><div style="padding: 1px;"><iframe class="timer" src="timer.html"></iframe></div></code></p>
<ul>
<li>for this situation we also set width: 100% on the body inside the iframe
<pre>
body {
width: 100%;
}
</pre></li>
</ul>
<p>Now the Safari web browser will also give the iframe the requested dimensions.</p>
<h2 id="web-browser-behaviour-is-a-moving-target">Web browser behaviour is a moving target</h2>
<p>To get non-standard things like dynamically-sized iframes to work predictably across all web browsers is a difficult
task. What makes it even more difficult is that the goal of having your document look the same in all browsers is a
moving target; browsers receive updates and people on different continents tend to use different browsers.
Whereas in Europe we mainly use Firefox, Chrome, Safari and Edge, we find other browsers in Asia.
At the time of writing, <a href="https://gs.statcounter.com/browser-market-share/all/china">this statcounter page about browser market share in China</a>
shows that the UC Browser and QQ Browser have a combined marketshare of 22.6% there.</p>
<h2 id="keeping-the-why-in-mind">Keeping the 'why' in mind</h2>
<p>I think the best way to look at web pages is that they are a way to provide information to people.
We can't expect the pages to look the same in all web browsers.
We must make sure that the information we want to provide to people is readable in all browsers.
For that reason I find it best to keep our designs simple.
Our design can be hip, or elegant, or show our personality and that of our company,
but the information must be readable at all times. </p>Mistakes will be made2020-05-12T00:00:00+02:002020-05-12T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-05-12:/articles/2020-05-12-mistakes-will-be-made.html<p>I occasionally make a mistake in a professional environment. Mistakes will be made, that is just how it is.
We do have to keep looking at how we handle them and make sure we are not making a habit of it.
Here I share some of my thoughts on the …</p><p>I occasionally make a mistake in a professional environment. Mistakes will be made, that is just how it is.
We do have to keep looking at how we handle them and make sure we are not making a habit of it.
Here I share some of my thoughts on the topic.</p>
<p>Last week a customer ran into a problem that I had created by mistake.
This is not a difficult kind of situation to handle and goes something like this:
I own up to it, give a quick explanation and then fix it myself or assist a colleague in fixing it,
and when the situation allows for it I talk to all involved and provide
all the necessary details about how the mistake came to be.</p>
<p>That last part is important. Not only to the customer and your colleagues, but also for yourself.
In explaining the situation in detail you will find exactly what went wrong and what role you played in the situation.
Perhaps you zigged when you should have zagged, as they say.</p>
<p>Key to my learning has been not just the breaking but, more importantly, the fixing.
I saw the mistakes I made as lessons learned when I realised what had happened,
and as a chance to dive deeper into what I was working on.
It feels good to learn to fix things for yourself, especially when the pressure is low.</p>
<p>Mistakes will get made. Problems will occur. Hopefully everything will be fine.
If the mistakes and problems you created still hurt, you will be okay, because you have an incentive to avoid them.</p>
<h2 id="colleagues">Colleagues</h2>
<p>During the last 9 years I have also been able to watch other people make mistakes in professional environments.
In some sense I enjoy the mistakes when they happen, because they tend to be rather harmless in the grand
scheme of things, yet a lot can be learned by paying attention while we are correcting them.
Another person's mistake and solution can be quite educational. </p>
<p>Would you have handled the situation the same way?
If not, you may want to ask what your colleague thinks about your proposed solution, after the urgency has died down.</p>
<p>Sometimes you have to interrupt then and there if you see a new problem in the proposed solution.
That is a more difficult situation and requires more insight in to the situation and people.
How certain are you? Is the potential problem a big problem?
Are you able to bring it up while keeping the discussion productive?
You may have to be a bit insistent to make sure you are heard. </p>
<p>Though at times you may feel your colleagues are taking a sub-optimal action,
a more likely situation is that you are missing some context that justifies your colleagues' action.
Because of this a good approach can be to ask if the scenario you are thinking of is possible,
just to give the others the opportunity to recognize something they might be overlooking,
and then let them continue.
If you still have doubts or questions after the situation has been resolved, you can discuss them with your colleagues
at that time.</p>
<h2 id="take-note-of-the-situation-and-move-on">Take note of the situation and move on</h2>
<p>You need to realize that you are going to make mistakes. Some won't be fixable.
For example, some production data will never come back.
It depends on a lot of factors, the simplest being the granularity of the backups.
You can't just put the old backup back; all newer data would then be lost forever.
The missing data might have to be scripted back in, which brings a lot of new risk.</p>
<p>Heated discussions can occur.
Just remember it is not about you or them, it is about solving the problems at hand.
Blame is not what it is about and rarely truly needs to be assigned.
If there was no malice, ignore the blame. I have personally never seen malice, only honest mistakes.</p>
<p>Unfortunately there is a lot of carelessness as well. Carelessness is not the same as the initial inexperience that
leads to mistakes. It stings a bit when I have to give someone my time due to their carelessness,
but it is part of my job to give support where needed.
I try to avoid being the careless one.</p>
<p>Don't hold a grudge. Tomorrow there will be another day with another problem that you will resolve together.
Take note of the situation and move on.
If the company culture allows for it, it may be a good idea to
<a href="https://landing.google.com/sre/sre-book/chapters/postmortem-culture/">make a write-up to share what happened and what you have learned</a>.
In follow-up discussions all involved can walk away with more ideas for improvement.</p>
<h2 id="users-and-their-representatives">Users and their representatives</h2>
<p>I make sure to remember that even though I am sometimes far removed from the consequences of mistakes made,
there is an end user of my work and that of my team.
Perhaps it is a person who has no choice in their use of our product, but does have mouths to feed.
It is probable that I am going to make some people sad, over the course of my career.
That is kind of inevitable if you don't work in a vacuum.
Software can be very frustrating and confusing, to the layman, the professional user
and the software developers as well.
Still, we should be gracious, even if a support ticket seems unfair to us and our work.
Just as it isn't always easy to be in our position, it isn't always easy to be in their position.
The best response is a helpful response.</p>
<h2 id="there-is-your-environment-and-then-there-is-you">There is your environment, and then there is you</h2>
<p>Where does our responsibility start and end?
If we have a responsibility in some area,
we should also have a way to influence the outcome of the actions taken in that area.</p>
<p>The new senior software developer who was replacing me didn't appear to be listening when I tried to warn him about the
inadequate deployment procedure, and the production database credentials that were included in the version control
system. I was there as a contractor for only five months and tasked with creating two web applications in a domain that
was new to me. I had no influence and no time to improve that part of the organization.
It did feel bad when a few hours later some production data was lost. This is one of the many situations that I have
learned from. Here I learned that I should trust my instinct and be more insistent the next time I feel that such a
warning is not properly heard.</p>
<p>It is up to the workplace as a whole to get safe procedures in place.
We can and must advocate for safe procedures.
In the meantime we do what we can to avoid making mistakes and creating problems.
We should also acknowledge that there are limits to what we can do by ourselves.
We are always dependent on others.
If situations in your environment force you to log in to a production database,
and you then proceed to do so, there is a chance of you doing some damage.
You should in your mind take responsibility for taking the risky action.
You should also recognise that apparently your environment is pushing you towards situations
where you are likely to create problems at some point.
Can you and your colleagues improve the workflow?
In my opinion it is always worth it to at least have the discussion,
and it is worth the effort it takes to get time scheduled for an investigation into better practices.</p>
<h2 id="avoiding-mistakes-by-avoiding-risky-situations">Avoiding mistakes by avoiding risky situations</h2>
<p>Mistakes are most likely to be made when we are tired, distracted,
or not fully aware of the details of the system we are working in.
The easiest way to avoid creating big problems is to not let ourselves be set up for a situation
where mistakes are likely to be made and likely to have a meaningful impact.
Because mistakes are never just likely. I feel that in workflows where mistakes are likely, they are also inevitable.</p>Still searching for a daily use computer that just works2020-04-26T00:00:00+02:002020-04-26T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-04-26:/articles/2020-04-26-still-searching-for-a-daily-use-computer-that-just-works.html<p>Unfortunately there have been several problems with my macOS install,
and with the Unix ports I install with <a href="https://brew.sh/">brew</a>. </p>
<p>There has been a time when the Finder appeared to not get an updated list of <a href="https://en.wikipedia.org/wiki/Inode">inodes</a>.
After moving files with the Finder the files appeared in the original directory as …</p><p>Unfortunately there have been several problems with my macOS install,
and with the Unix ports I install with <a href="https://brew.sh/">brew</a>. </p>
<p>There has been a time when the Finder appeared to not get an updated list of <a href="https://en.wikipedia.org/wiki/Inode">inodes</a>.
After moving files with the Finder the files appeared in the original directory as well as the updated directory.<br>
Confused by this you might think that you had accidentally copied the files instead of moving them.
Deleting 'the original files' would lead to deleting the files in their new directory.
They wouldn't be available in their new directory anymore. After all, there were never any copies.
It was just that the Finder was still showing them in the original directory.
I don't know if the Trash would have been able to restore them; I have the bad habit of bypassing the trash can.
An old habit from when disk sizes were small.
So I resorted to always using the command line, even for file-related tasks that the Finder was able to handle reasonably
efficiently in the past, using the Finder only for the image previews.<br>
On one recent morning the Finder started working correctly again.</p>
<p>There have been other problems such as sound not working anymore.</p>
<p>Twice brew has appeared to become entirely confused, a lot of packages had to be reinstalled.
The second time I reinstalled brew itself as well if I remember correctly.</p>
<p>Today I had to reinstall <a href="https://nmap.org/">nmap</a>. I don't know how it broke. It worked a few days earlier.</p>
<div class="highlight"><pre><span></span><code>$ nmap XXX.YYY.ZZZ.0/24
dyld: Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib
Referenced from: /usr/local/bin/nmap
Reason: image not found
Abort trap: <span class="m">6</span>
$ brew install nmap
...
Error: nmap <span class="m">7</span>.70 is already installed
To upgrade to <span class="m">7</span>.80_1, run <span class="sb">`</span>brew upgrade nmap<span class="sb">`</span>.
$ brew uninstall nmap
Uninstalling /usr/local/Cellar/nmap/7.70... <span class="o">(</span><span class="m">807</span> files, <span class="m">26</span>.8MB<span class="o">)</span>
$ brew install nmap
</code></pre></div>
<p>After reinstalling it worked again.</p>
<p>The reason I switched from using only Linux to using macOS for my laptop was that I was getting a bit tired after +- 15
years of fixing problems with my personal Linux installs.
I wanted something that just works. Unfortunately I can't say that has really become true.
What I will say though is that the general quality of this 2018 MacBook Pro is quite good.
There have been few crashes, no unintended openings or cracks have appeared on the laptop,
and there have been no bulging heat pipes so far. Fingers crossed!</p>
<p>Just make sure not to get any
<a href="https://www.cnet.com/news/apple-may-be-working-on-a-crumb-resistant-macbook-keyboard/">breadcrumbs in your keyboard</a>.</p>
<p>And make sure to
<a href="https://apple.stackexchange.com/questions/363337/how-to-find-cause-of-high-kernel-task-cpu-usage/363933#363933">charge it from the right side ports</a>. </p>Quick text manipulation, a practical `sed` example2020-04-22T00:00:00+02:002020-04-22T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-04-22:/articles/2020-04-22-quick-text-manipulation-a-practical-sed-example.html<p>Suppose you have received a 10k line file of text in a format that is difficult for you to work with, like XML.
You want to get some specific information from that file,
and realize that getting that information by hand will take a long while.</p>
<p>In some cases an …</p><p>Suppose you have received a 10k line file of text in a format that is difficult for you to work with, like XML.
You want to get some specific information from that file,
and realize that getting that information by hand will take a long while.</p>
<p>In some cases an editor like IntelliJ IDEA can be really useful (see footnote),
but a full-blown IDE may not be what you are looking for.
You may want to use what you have, or you may need it as part of a script.
Here I want to show you an example of how I have found <a href="https://en.wikipedia.org/wiki/Sed"><code>sed</code> the stream editor</a> quite useful.
It is available on Linux and macOS without having to install anything.</p>
<p>Our input file has tags (names) and values for many types of data. We only want the list of values of the identifier
tags of a specific type of parent tag.
Let's say we are working with some <a href="https://en.wikipedia.org/wiki/Enterprise_resource_planning">enterprise resource planning</a>
applications and are looking for a list of widget identifiers that are in the input file.
For example, we want to use these to correlate data between two systems such that we can put together a more
complete dataset for a once-yearly report.
Luckily for us that identifier we need is on the line after the opening tag of the widget datatype.
Once we have the list of identifiers we want to use those to get more information from an SQL database.</p>
<p>To clarify, the input data we are interested in looks like this:
<pre><code>
...
<widget>
<id>6Q</id>
...
</code></pre></p>
<p>Let us look at the different steps and how <code>sed</code> fits in.
First we make the output file that we want to store the output in.</p>
<p><code>touch output.txt;</code></p>
<p>Then we want to search the input file for the <code><widget></code> tags, when we find one we go to the next line and just
forget about the line that we found the <code><widget></code> token on. </p>
<p><code>touch output.txt;
sed -n '/<widget>/ {n;p;}' < input.xml</code></p>
<p>Here we have asked <code>sed</code> to go to the <code>n</code>ext line and <code>p</code>rint that if we have found <code><widget></code>.
By specifying <code>-n</code> we ask <code>sed</code> not to print anything other than what we specifically asked it to. </p>
<p>Now we will have a list of about 2k identifiers, but they still have their tags, like so: <code><id>6Q</id></code>.
We don't want those tags. Neither do we want the whitespace around the tags.</p>
<p>So let us pipe the stream that sed outputs into the next steps using that vertical thing called the pipe, <code>|</code>.
For example, we will tell sed to replace <code><id></code> with nothing. We do that by using the pattern <code>s/existing text we don't want/new text we do want/</code></p>
<p><code>touch output.txt;
sed -n '/<widget>/ {n;p;}' < input.xml | sed 's/<id>//g' | sed 's/<\/id>/,/g'</code></p>
<p>Now we have removed <code><id></code> and replaced <code></id></code> with a comma.</p>
<p>Because we don't use any additional flag in the new sed commands we can also combine them in one call with the <code>-e</code>
flag.
I believe that will be faster because there is no more data transfer from one process to another.
The version of <code>sed</code> on my Linux computer does not seem to support the <code>-e</code> flag though.
If you are using <code>sed</code> on Linux you might need to keep piping the commands to each other.</p>
<p><code>touch output.txt;
sed -n '/<widget>/ {n;p;}' <input.xml | sed -e 's/<id>//g' -e 's/<\/id>/,/g'</code></p>
<p>Next we remove the whitespace that we don't want.</p>
<p><code>touch output.txt;
sed -n '/<widget>/ {n;p;}' < input.xml | sed -e 's/<id>//g' -e 's/<\/id>/,/g' -e 's/ //g' | sed ':a;N;$!ba;s/\n/ /g' > output.txt</code>
(The line-joining expression gets its own <code>sed</code> call here: if it were combined with the substitutions using <code>-e</code>, the branch back to <code>:a</code> would skip those substitutions for every line after the first.)</p>
<p>We have replaced the whitespace with nothing. What comes next is difficult to read.
Why not just use <code>sed 's/\r\n//g'</code> to replace <code>\n</code> or <code>\r\n</code> with nothing?
The newlines and carriage returns <a href="http://sed.sourceforge.net/sedfaq5.html#s5.10">will not be seen by <code>sed</code></a>
because <code>sed</code> will normally work on the actual contents of each line, line after line.
We have to do more work or switch to using <a href="https://en.wikipedia.org/wiki/Tr_(Unix)">tr, which is a simpler way to swap text</a>.
A good explanation of what the pattern does <a href="https://stackoverflow.com/a/7697604">can be found here</a>.
In short, we mark a position as <code>a</code>, and add the next line with <code>N</code>. <code>$!ba</code> means we keep doing this until the whole
file is inside our computer memory. If we are not at the last line, <code>$!</code>, we move <code>b</code>ack to position <code>a</code> and keep
going.
This way <code>sed</code> can handle all input in one go, including newlines.</p>
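<p>As a sketch of the <code>tr</code> alternative mentioned above (the input here is a hypothetical two-line fragment, standing in for our intermediate output):</p>

```shell
# tr swaps or deletes characters in a stream; unlike sed it is not
# line-oriented, so newlines are just another character to it.
printf '1Z,\n6Q,\n' | tr '\n' ' '
# gives: 1Z, 6Q,  (all on one line; the trailing comma still needs trimming)
```

<p>Deleting the newlines outright with <code>tr -d '\n'</code> works just as well if you do not want the separating spaces.</p>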
<p>The result has been written to the file <code>output.txt</code> which now contains all widget id values separated by a comma.
The last widget id also ends in a comma; that needs to be removed as well.
Here <code>sed</code> comes to the rescue again with <code>$</code>, which anchors the pattern to the end of the line.
<code>sed 's/.$//' < output.txt</code> will remove the last character, our trailing comma.
Now we can use it to make a select statement on a database table:
<code>SELECT name, quantity_sold, quantity_unit FROM widgets WHERE objectid IN (...)</code> where we fill the brackets with the
list of ids we created.</p>
<p><code>echo "SELECT name, quantity_sold, quantity_unit FROM widgets WHERE objectid IN ("; sed 's/.$//' < output.txt; echo ");"</code></p>
<p>will then result in the following format</p>
<p><code>SELECT name, quantity_sold, quantity_unit FROM widgets WHERE objectid IN (
1Z, 2G, 3A, 4T, 5H, 6Q, 7P
);</code> </p>
<p>There we have it, the full query.</p>
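<p>To see the whole flow end to end, here is a self-contained run against a made-up miniature input file (the ids and the <code><gadget></code> entry are invented for illustration; the gadget shows that only widget ids are picked up):</p>

```shell
# Create a tiny stand-in for the 10k-line input file.
cat > /tmp/input.xml <<'EOF'
<root>
  <widget>
    <id>1Z</id>
  </widget>
  <gadget>
    <id>9X</id>
  </gadget>
  <widget>
    <id>6Q</id>
  </widget>
</root>
EOF

# Print the line after each <widget>, strip the tags and whitespace,
# join the lines, and trim the trailing comma.
sed -n '/<widget>/ {n;p;}' < /tmp/input.xml \
  | sed -e 's/<id>//g' -e 's/<\/id>/,/g' -e 's/ //g' \
  | sed ':a;N;$!ba;s/\n/ /g' \
  | sed 's/.$//' > /tmp/output.txt

cat /tmp/output.txt
# 1Z, 6Q
```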
<p>The Grymoire probably has <a href="https://www.grymoire.com/Unix/Sed.html">the best <code>sed</code> page I have seen</a>, go check it out!</p>
<p>IntelliJ IDEA footnote:
It offers Column Selection Mode and <a href="https://blog.jetbrains.com/idea/2014/03/intellij-idea-13-1-rc-introduces-sublime-text-style-multiple-selections/">multiple carets</a>.
Note that the linked blog post is quite old, I suspect more is possible these days.</p>I found myself working hard instead of smart2020-04-16T00:00:00+02:002020-04-16T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2020-04-16:/articles/2020-04-16-i-found-myself-working-hard-instead-of-smart.html<p>Ah, the often mentioned "work smart, not hard". As well as "not invented here"...</p>
<p>I had set out to re-launch my blog.
Somehow, in my enthusiasm, I ended up writing a static website generator almost to completion.
The only two remaining topics were the two topics that most looked like …</p><p>Ah, the often mentioned "work smart, not hard". As well as "not invented here"...</p>
<p>I had set out to re-launch my blog.
Somehow, in my enthusiasm, I ended up writing a static website generator almost to completion.
The only two remaining topics were the two topics that most looked like work instead of hobby.
Though I love both learning and programming, there are limits to what I will do in my free time.
The remaining work led me to think about how I could reduce the amount of work by scrapping features and by
intelligent reuse of software projects I already had lying around.
That was probably the first time I approached this new hobby project as a serious project, and the flaws in what I was
doing immediately became clear to me.</p>
<p>Instead of doing things that I find challenging such as writing an interesting and readable text and then putting that
work out there to be seen, I was doing something that I find easy and enjoyable; I was using my blog as a reason to
refresh my shell scripting skills.
Another flaw in my approach was that I had not approached the blog re-launch as a serious project and had not done some
things I always do when I do work for someone else.</p>
<p>I was working hard instead of smart: I had failed to create any type of design document, keeping every idea in my head,
and I had decided not to use existing tooling because it seemed less fun and flexible than my own static
website generator would be. While trying to use my own static website generator for this re-launch I found plenty of
remaining work that refuted those assumptions; the last 20% of the work tends to be as much work as the first 80%.</p>
<p>Luckily I did set a deadline for the initial objective, the re-launch of my blog, which has not been delayed by much.
The question now is which road I will take: will I continue with my own static website generator, or will I use
Pelican again, as I do with my current blog? (Edit: I am using Pelican now and I do enjoy it.)</p>
<p>Either way, it was a refreshing learning experience. It was all on me and there were no external factors that could be
blamed. That makes it easier to learn from the situation than in most situations that arise on the job.</p>Connecting to your printer on a Linux system2015-02-08T00:00:00+01:002015-02-08T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2015-02-08:/articles/2015-02-08-connecting-to-your-printer-on-a-linux-system.html<p>If you are using a GNU/Linux distribution and are having trouble finding your printer in print dialog screens, the
following may be of help to you.
Make sure you have CUPS installed, the <a href="http://www.cups.org/">Common Unix Printing System</a>. Start cups first, on my
system this can be done by executing …</p><p>If you are using a GNU/Linux distribution and are having trouble finding your printer in print dialog screens, the
following may be of help to you.
Make sure you have CUPS installed, the <a href="http://www.cups.org/">Common Unix Printing System</a>. Start cups first, on my
system this can be done by executing</p>
<div class="highlight"><pre><span></span><code><span class="err"># /etc/init.d/cups start</span>
</code></pre></div>
<p>CUPS 1.5.3 uses <code>http://localhost:631/</code> as the configuration utility.
On my install, I was unable to print anything because no printer was found. Unfortunately I was also unable to add a
printer.
I had to edit the configuration file, which is located at <code>/etc/cups/cupsd.conf</code>.
Be sure to make a backup if you do decide to edit it.
I changed all lines containing</p>
<div class="highlight"><pre><span></span><code><span class="err">"DefaultAuthType ..." into "DefaultAuthType none"</span>
<span class="err">"AuthType ..." into "AuthType none"</span>
</code></pre></div>
<p>and removed all lines such as <code>"Require user @SYSTEM"</code>
Then I was able to find my printer and print.</p>
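<p>If you want to script those edits rather than make them by hand, something along the following lines should work. It is shown here on a small made-up sample file, so you can try it safely; point it at a backup copy of <code>/etc/cups/cupsd.conf</code> once you are confident:</p>

```shell
# A made-up fragment standing in for /etc/cups/cupsd.conf.
cat > /tmp/cupsd.conf <<'EOF'
DefaultAuthType Basic
<Location />
  AuthType Default
  Require user @SYSTEM
</Location>
EOF

# Rewrite the auth directives and drop the Require lines, as described above.
sed -i \
  -e 's/^\([[:space:]]*\)DefaultAuthType .*/\1DefaultAuthType none/' \
  -e 's/^\([[:space:]]*\)AuthType .*/\1AuthType none/' \
  -e '/Require user @SYSTEM/d' \
  /tmp/cupsd.conf
```

<p>Note that <code>-i</code> (in-place editing) is a GNU <code>sed</code> extension; on other systems, redirect to a new file instead.</p>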
<p>There are probably ways to get more fine-grained access control that still allows you to print, but frankly, when you
need to print something you need to print it right then, and not a couple of days later. The mentioned edits might be
seen as bad for security, but for a small network the risk seems low. Especially when compared to not being able to
print.</p>Programmatically creating scalable vector graphics (SVG)2015-02-07T00:00:00+01:002015-02-07T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2015-02-07:/articles/2015-02-07-programmatically-creating-scalable-vector-graphics-svg.html<p>This is a small note on programmatically creating
<a href="https://en.wikipedia.org/wiki/Scalable_Vector_Graphics">scalable vector graphics</a>. For this we use Python
with <a href="https://pypi.python.org/pypi/svgwrite">svgwrite</a> which was simply the first tool I found. We will not be
comparing different tools.</p>
<p>When creating graphics for posters, programs, or the web there are some advantages in using scalable vector …</p><p>This is a small note on programmatically creating
<a href="https://en.wikipedia.org/wiki/Scalable_Vector_Graphics">scalable vector graphics</a>. For this we use Python
with <a href="https://pypi.python.org/pypi/svgwrite">svgwrite</a> which was simply the first tool I found. We will not be
comparing different tools.</p>
<p>When creating graphics for posters, programs, or the web there are some advantages in using scalable vector graphics
over regular graphics such as PNG and JPG. SVG is scalable (resizable) without any loss of detail, or 'fuzziness'. </p>
<p>Here we see the output of <a href="https://tacosteemers.com/files-posts/code/svg/using_svgwrite.py">the source code used in this example</a>. You can zoom in
as much as you like without the graphic looking 'pixelated'. This is because SVG does not use pixels to describe the
graphic; it uses <a href="https://en.wikipedia.org/wiki/Vector_image_format">vectors</a>. Note that the program that you use to
look at the SVG file may limit how far you can zoom; the browser I used to proof-read my post only allows a 2x zoom.
However, there are no technical limitations.</p>
<p><a href="https://tacosteemers.com/articles/using-svgwrite.svg"><img src="https://tacosteemers.com/articles/using-svgwrite.svg" width="400" height="400"/></a></p>
<p>The basic idea is that we create an object with a shape and a location. Then we add that object to our 'canvas', the
SVG document.
Like so:</p>
<div class="highlight"><pre><span></span><code><span class="o">#</span> <span class="n">Creating</span> <span class="n">a</span> <span class="n">canvas</span>
<span class="n">svg_document</span> <span class="o">=</span> <span class="n">svgwrite</span><span class="p">.</span><span class="n">Drawing</span><span class="p">(</span><span class="n">filename</span> <span class="o">=</span> <span class="ss">"using-svgwrite.svg"</span><span class="p">,</span> <span class="k">size</span> <span class="o">=</span> <span class="p">(</span><span class="ss">"100px"</span><span class="p">,</span> <span class="ss">"100px"</span><span class="p">))</span>
<span class="o">#</span> <span class="n">Creating</span> <span class="n">a</span> <span class="n">line</span>
<span class="n">lineA</span> <span class="o">=</span> <span class="n">svg_document</span><span class="p">.</span><span class="n">line</span><span class="p">((</span><span class="n">xStart</span><span class="p">,</span> <span class="n">yStart</span><span class="p">),</span> <span class="p">(</span><span class="n">xEnd</span><span class="p">,</span> <span class="n">yEnd</span><span class="p">),</span> <span class="n">stroke_width</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span> <span class="n">stroke</span> <span class="o">=</span> <span class="n">colorA</span><span class="p">)</span>
<span class="o">#</span> <span class="k">Placing</span> <span class="n">the</span> <span class="n">line</span> <span class="k">on</span> <span class="n">the</span> <span class="n">canvas</span>
<span class="n">svg_document</span><span class="p">.</span><span class="k">add</span><span class="p">(</span><span class="n">lineA</span><span class="p">)</span>
</code></pre></div>
<p>It is that simple.</p>
<p>In the source code we use two different code paths for the 'horizontal' and the 'vertical' four-leaf clover. If the
code had better structure, and the methods returned elements rather than adding them to the document immediately, we
could have used the <a href="http://pythonhosted.org/svgwrite/classes/mixins.html#transform-mixin">'rotate' transformation</a>
to rotate the four-leaf clover as we wished.</p>
<p>One of the clovers doesn't look quite right. I was not able to get the 'swirl' to look right on all four leaves. Can
you solve that?</p>
<p>You can create prettier patterns than the ones shown here, but I have deliberately not included mine. It is much more
fun to try creating your own patterns than it is to look at someone else's!</p>
<p>Another library you might want to look at is Cairo, which has bindings for many different languages, according
to <a href="https://en.wikipedia.org/wiki/Cairo_%28graphics%29">Cairo's Wikipedia page</a>.</p>
<p>The source was run with Python 2.7.3 and svgwrite 1.1.6.</p>Setting up a Secure FTP server (SFTP)2015-01-29T00:00:00+01:002015-01-29T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2015-01-29:/articles/2015-01-29-setting-up-a-secure-ftp-server-sftp.html<p>We want to set up a secure FTP server (let us call this 'the service', to avoid confusion). This service will receive
backups.
The service and its clients don't need access to any unrelated commands. So we will make an empty <code>PATH</code> and won't
let the users perform a normal …</p><p>We want to set up a secure FTP server (let us call this 'the service', to avoid confusion). This service will receive
backups.
The service and its clients don't need access to any unrelated commands. So we will make an empty <code>PATH</code> and won't
let the users perform a normal login.
The service only needs to have access to one directory. We will attempt to restrict it to that directory by using the
<code>chroot</code> utility ('change root') which will restrict the service's view of the server's filesystem. The service will
need to be able to find all its dependencies as well. One complication is that this will entail having to place the
service and its dependencies in a location that is not known to the normal system update service. I considered trying
to fix this by automatically copying the updated versions over the chrooted copies and restarting the service, until I
realized a symlink would probably be much better.</p>
<p>As it turns out, there is a good but shallow tutorial over at
<a href="http://www.debian-administration.org/article/590/OpenSSH_SFTP_chroot_with_ChrootDirectory">Debian Administration</a>.
This tutorial shows the correct settings to set in <code>/etc/ssh/sshd_config</code>.
These are 'Subsystem sftp internal-sftp' and</p>
<div class="highlight"><pre><span></span><code><span class="err">Match group sftponly</span>
<span class="err"> ChrootDirectory /home/%u</span>
<span class="err"> X11Forwarding no</span>
<span class="err"> AllowTcpForwarding no</span>
<span class="err"> ForceCommand internal-sftp</span>
</code></pre></div>
<p>According to this tutorial, the system described there does not suffer from the mentioned update problems. We can use
this tutorial. Be sure to set 'AllowTcpForwarding no' so that your service cannot be used as a proxy.
Note that <code>internal-sftp</code> is not the binary that we run, that is still the regular ssh server. It is an instruction to
use a version of the sftp service that can work in combination with chroot. Only that instruction will be run when a
user in <code>sftponly</code> tries to use the service.</p>
<p>Unfortunately, there seems to be no way to have a user use a chrooted SFTP service while still allowing their accounts
to easily be used for other services. This is because one will want to let these other services store files in the
user's home directory, which the user will be able to access. This access is not always desired. The reason that one
cannot constrain the SFTP user to a specific directory in the home directory of the user, has to do with the proper
usage of access patterns when using chroot, and the access patterns enforced by sshd. If we want to use <code>/home/%u</code> as
the chroot directory, we must allow only root to manipulate that directory. As a result, we need a directory that the
user has privileges in, let's make that <code>/home/%u/sftp</code>. However, when the user connects to SFTP, they will be dropped
into their home directory, <code>/home/%u</code>. We removed their privileges for this directory. To let the user be dropped into
<code>/home/%u/sftp</code> instead, we need to make that their home directory. This makes it difficult to store files that should
not be accessible over SFTP.</p>
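<p>Put concretely, the directory layout and ownership discussed above could be set up along these lines (a sketch only: <code>myuser</code> is a placeholder, and these commands need root):</p>

```shell
# The chroot target must be owned by root and not writable by the user,
# or sshd will refuse to chroot into it.
chown root:root /home/myuser
chmod 755 /home/myuser

# A subdirectory the user does own, where the uploads actually land.
mkdir -p /home/myuser/sftp/files
chown myuser:sftponly /home/myuser/sftp /home/myuser/sftp/files
chmod 755 /home/myuser/sftp
```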
<p>Let's start setting up the service. First, we create a group <code>sftponly</code> to which we will add the user accounts intended
for sftp. We remove the users from other groups <insert example of these commands>. From here, we can largely follow
the other tutorial. To troubleshoot problems, stop the sshd service and use <code>/usr/sbin/sshd -d</code> which will give easy
access to debug logging. To allow the sshd to do its work, we have to set <code>chmod 755 /home/myuser/</code> and
<code>chmod 755 /home/myuser/sftp</code>. Preferably I would have let only the owners interact with the intended directories,
using <code>chmod -R o-rwx .</code> on /home/myuser. However, this will cause SFTP to malfunction; it cannot drop the user in the
user's directory, as it does not have access to it. We set the user's home directory to the <code>files</code> directory inside
the user's SFTP root directory; <code>usermod -d /sftp myuser</code>. At this point, the user will be dropped in
<code>/home/%u/sftp/files</code> when they have successfully connected. To stop the user from logging in over SSH, we disable
their shell; <code>usermod -s /sbin/nologin myuser</code>. If the user tries to SSH, they will be told that 'This service allows
sftp connections only.'</p>Some notes on trying out Crashplan, Duplicati and BackupPc2014-08-30T00:00:00+02:002014-08-30T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2014-08-30:/articles/2014-08-30-some-notes-on-trying-out-crashplan-duplicati-and-backupc.html<h1 id="observations-after-testing-crashplan">Observations after testing Crashplan.</h1>
<p>The Crashplan test-setup has not been able to connect for a while. As a result these notes are partially from
memory and could not be revisited.
The user interface is confusing, buttons are placed far away from the context they operate in.
There are buttons that …</p><h1 id="observations-after-testing-crashplan">Observations after testing Crashplan.</h1>
<p>The Crashplan test-setup has not been able to connect for a while. As a result these notes are partially from
memory and could not be revisited.
The user interface is confusing; buttons are placed far away from the context they operate in.
There are buttons that fold out UI elements when you click on the buttons, but not a lot of extra information appears.
It would have been better if more information were grouped together under fewer buttons.
The Linux and Windows versions have inconsistent user interfaces.
At some point my test setup suddenly stopped connecting. I tried several command-line actions; either recommended or
not recommended. It didn't help.
The restore-screen has a nice filter for finding the file that you want.
There are confusing problems with files that were recently removed. It looks like it does not let you restore a file
if it does not realise yet that the file has recently been changed or removed. Or perhaps it is related to the files
being in the trashcan?
For reasons that are unclear it takes several completed backups before a 4GB file is actually backed-up.
The restore-list would show a 0 byte file during that time.</p>
<h1 id="observations-after-testing-duplicati">Observations after testing Duplicati.</h1>
<p>It supports transfer over SSH which is great. I love tried and true technology.
Unfortunately I have not been able to restore any files. The restore-screen requires a lot of clicking.
It keeps giving warnings, even if you are not interested.
There are very detailed descriptions of how Duplicati handles copying locked files.
Even so, it is unclear to me how successful it tends to be in backing up locked files.
Note that when you want to use SFTP, you should select 'SSH-based' rather than 'FTP-based'.
This is because FTP and SFTP are not the same thing, and SFTP makes use of an SSH connection.
Make sure to check the box 'Ignore file modification timestamp when making incremental backups'.
Otherwise, Duplicati will only use a very simplistic measure of determining whether to make a backup
(the file modification timestamp) which will fail when you get a file with an older timestamp than the one known to
Duplicati. This can easily happen if you get the latest version of the file from a different computer
(or a service running on a different computer, such as a mail service).
A discussion of that can be found here:
<a href="https://code.google.com/p/duplicati/issues/detail?id=911">https://code.google.com/p/duplicati/issues/detail?id=911</a></p>
<h1 id="observations-after-testing-backuppc">Observations after testing BackupPc</h1>
<p>BackupPc appears to perform the basic required functions as expected. Files are being backed up and they are restorable.
Additional positive features are that the files are stored in such a manner that hey appear to be manually restorable with ease.
The server-side configuration of the clients is finicky. It can take several tries to find the correct format of listing the client address. Once this is done there is no hassle.
BackupPc can mail reports, but I have not tried this.
The downside, for my use case, is that I have not been able to think of an easy way to notify the client that a backup is running or has finished.</p>
<h1 id="conclusion">Conclusion</h1>
<p>As it stands, I am not satisfied that I have found a workable solution. Currently I am considering using the backup
utilities supplied with the Windows OS. These work over SMB shares. Initially this was not seen as an option because
it is not CryptoLocker-safe. To bolt on such safety, one might make a server-side copy of the share at regular
intervals. Plain mirroring would be a mistake: the crypto-locked files would be mirrored too, so a simple mirror
provides no protection against CryptoLocker-like malware. To be more robust, the server-side backup of the
backup target will have to contain versions from several moments in time. This requires additional storage space,
which can be reduced by using deduplication. This backup-of-a-backup (with deduplication)
could perhaps be provided by the previously mentioned BackupPc.
Unfortunately, SMB sharing has also let me down; the shares are often not reconnected on startup. Instead we find the
notice that 'the network name cannot be found'. This is curious, because a ping using that network name is successful.
After disconnecting and reconnecting the share manually, it works again.</p>
<p>In this scheme, the clients would use the user-friendly Windows-native backup tools to back up to the share.
These will be versioned backups. Let us call this the client backup collection. The server would make this process
robust against malware by storing several versions of the client backup collection in a location that cannot be visited
by clients. If a client becomes infected with malware that attacks its files, the client backup collection will most
likely also be attacked. If we have infected clients, assuming that we are looking at a small network consisting of
around ten devices, a single operator could take the following steps:
First:
- Stop all network connectivity.
- Inspect the devices to see which require further care. Easier said than done.
The following actions could be performed in parallel:
- Re-load/re-install the client OS. Make sure not to attach an infected device to a vulnerable network or device.
It seems wise to keep the newly cleaned devices disconnected for now.
- On the server, bring the client backup collection back to the latest state before infection.
Once it is concluded that all devices are clear:
- Reconnect them
- Restore their files using the OS-provided tools. Depending on network performance this may have to be performed in
more than one batch.</p>Configuring an Apache installation for use with the SSL protocol2014-01-12T00:00:00+01:002014-01-12T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2014-01-12:/articles/2014-01-12-configuring-an-apache-installation-for-use-with-the-ssl-protocol.html<p>Is your ownCloud client saying <code>Failed to connect to ownCloud: Connection refused</code>?</p>
<p>A possible cause could be that the webserver that is serving your ownCloud does not have SSL enabled. In this note I
will describe how I did that for my own Apache 2 install. If you do a …</p><p>Is your ownCloud client saying <code>Failed to connect to ownCloud: Connection refused</code>?</p>
<p>A possible cause could be that the webserver that is serving your ownCloud does not have SSL enabled. In this note I
will describe how I did that for my own Apache 2 install. If you do a websearch for <code>apache2 ssl</code> you will probably
find many search results, but none of the pages I found applied to the install I had - all used different files and
directories. For that reason I am posting this note.</p>
<p>If you are using the Apache web server, a version close to 2.2, you can probably enable SSL the way it is outlined in
this note. To find out which version of Apache my server has, I ran <code>apache2 -v</code> on it.</p>
<div class="highlight"><pre><span></span><code>$ apache2 -v
Server version: Apache/2.2.22 <span class="o">(</span>Debian<span class="o">)</span>
Server built: Mar <span class="m">4</span> <span class="m">2013</span> <span class="m">22</span>:05:16
</code></pre></div>
<p>We will now create a private key and a certificate, but before we do that, we should create and navigate to the
<code>/etc/apache2/ssl/</code> directory. Our server is called server1. This is what we will enter as the 'common name'
when we are asked for it.
We can create a private key and a certificate signing request with the following example command:</p>
<div class="highlight"><pre><span></span><code><span class="err">openssl req -new -newkey rsa:2048 -nodes -keyout server1.key -out server1.csr</span>
</code></pre></div>
<p>It will probably make sense to add something like <code>-days 365</code>, which indicates how long the certificate should be
valid. In my case it does not seem necessary, as both server and clients are on my personal network.</p>
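<p>Note that the command above produces a private key and a certificate signing request, while the configuration below expects a certificate file (<code>server1.crt</code>). One way to produce a self-signed certificate from that pair is shown below; <code>-subj</code> is added here to skip the interactive questions, and the common name is an example:</p>

```shell
# Generate a key and CSR non-interactively, then self-sign the CSR to
# produce the server1.crt file that the Apache configuration refers to.
openssl req -new -newkey rsa:2048 -nodes \
    -keyout server1.key -out server1.csr -subj "/CN=server1"
openssl x509 -req -days 365 -in server1.csr \
    -signkey server1.key -out server1.crt
```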
<p>Now we need to tell Apache to use it. We make sure the top of our site configuration, which is contained in
<code>/etc/apache2/sites-available/default</code> by default, looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="err"><VirtualHost *:443></span>
<span class="err"> ServerAdmin webmaster@localhost</span>
<span class="err"> ServerName server1:443</span>
</code></pre></div>
<p>We also add the following:</p>
<div class="highlight"><pre><span></span><code><span class="err"> SSLEngine on</span>
<span class="err"> SSLCertificateKeyFile /etc/apache2/ssl/server1.key</span>
<span class="err"> SSLCertificateFile /etc/apache2/ssl/server1.crt</span>
</code></pre></div>
<p>We will instruct Apache to use <a href="https://httpd.apache.org/docs/current/mod/mod_ssl.html">the 'mod_ssl' module</a>,
which uses <a href="http://www.openssl.org/">OpenSSL</a>. Install it if it isn't installed yet (check <code>/usr/lib/apache2/modules/</code>
to see if it is installed).
We can use <code>a2enmod ssl</code> to enable 'mod_ssl', and <code>a2ensite default-ssl</code> to enable the default SSL site
configuration in <code>/etc/apache2/sites-available/</code>.</p>
<p>We can also do it manually. If we check which files are listed in <code>/etc/apache2/mods-available</code>, we should find
<code>ssl.conf</code> and <code>ssl.load</code>. We will now create symbolic links to these files in the <code>/etc/apache2/mods-enabled</code>
directory, that way Apache knows we want these mods to be enabled.</p>
<div class="highlight"><pre><span></span><code><span class="err">cd /etc/apache2/mods-enabled</span>
<span class="err">ln -s ../mods-available/ssl.conf ssl.conf</span>
<span class="err">ln -s ../mods-available/ssl.load ssl.load</span>
</code></pre></div>
<p><code>mod_ssl</code> is now enabled.</p>
<p>Now the apache2 server needs to be restarted, or at least told to reload its configuration. One can use</p>
<div class="highlight"><pre><span></span><code><span class="err">service apache2 reload</span>
</code></pre></div>
<p>on modern Debian(-based) installs.</p>
<p>Your ownCloud should now be reachable on 'https://<code><server></code>/owncloud'. Of course, your ownCloud client and web browser
will ask you if you trust this self-signed certificate.</p>
<div class="highlight"><pre><span></span><code>Warnings about current SSL Connection:
The host name did not match any of the valid hosts for this certificate
The certificate is self-signed, and untrusted
...
...
</code></pre></div>
<p>In this case I'm fine with this - I can check the certificate details myself, and am only really using the certificate
to get my own ownCloud client working with my own ownCloud server.</p>Memory for the SuperMicro X7SPA-H-D525-02014-01-12T00:00:00+01:002014-01-12T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2014-01-12:/articles/2014-01-12-memory-for-the-supermicro-x7spa-h-d525-0.html<p>I recently bought the SuperMicro X7SPA-H-D525-0 motherboard. The Micron memory
listed as compatible by SuperMicro was no longer available. I then checked
Kingston's compatibility list. It listed two options for this motherboard.</p>
<p>Some blog- and forum posts mentioned using similar Kingston memory for similar
SuperMicro motherboards. Taking the Atom D525 …</p><p>I recently bought the SuperMicro X7SPA-H-D525-0 motherboard. The Micron memory
listed as compatible by SuperMicro was no longer available. I then checked
Kingston's compatibility list. It listed two options for this motherboard.</p>
<p>Some blog- and forum posts mentioned using similar Kingston memory for similar
SuperMicro motherboards. Taking the Atom D525 CPU into account, I decided to
order the first stick on Kingston's compatibility list. It didn't work.</p>
<p>I hadn't been able to find anyone mentioning my exact motherboard. There were
plenty using similar motherboards (such as the HF, with IPMI) and all seemed
to use Kingston, or memory that just wasn't available anymore.
So I went ahead and ordered the second stick on Kingston's compatibility
list. I think you can guess where this is going...</p>
<p>It didn't work. After a couple additional evenings of searching I finally
found someone mentioning that they used this specific motherboard in
combination with the Corsair CMSO2GX3M1A1333C9 (I lost this reference).
I can confirm that this works. For the X7SPA-H-D525-0 you can use the
Corsair CMSO2GX3M1A1333C9, and luckily it is widely available.</p>
<p>I figured I would just post this, in case anyone else was planning to get that
motherboard...</p>Switched from Wordpress to Pelican2014-01-03T00:00:00+01:002014-01-03T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2014-01-03:/articles/2014-01-03-switched-from-wordpress-to-pelican.html<p>Over the past two years I've posted some writing on a different domain. That is a shared Wordpress site which hasn't
been seeing much love (the other party hasn't made any posts). I've been thinking of also writing some smaller notes,
things that I tend to forget and then have …</p><p>Over the past two years I've posted some writing on a different domain. That is a shared Wordpress site which hasn't
been seeing much love (the other party hasn't made any posts). I've been thinking of also writing some smaller notes,
things that I tend to forget and then have to find out again. I'd rather not post them on this shared site because of
the constant need for Wordpress updates, and because I tend to find Wordpress to be too much of a hassle when you want
to make small adjustments. Therefore I decided to use a different domain and add a static site there (that is, here,
where you are right now). </p>
<p>This site uses <a href="http://getpelican.com">Pelican</a> to create a static site from files written in
<a href="http://daringfireball.net/projects/markdown">Markdown</a>. I've used Thomas Frössman's
<a href="https://github.com/thomasf/exitwp">exitwp</a> tool to get my posts from TophatCoders, the Wordpress site, in to text
files with the Markdown markup. This worked quite well. Exitwp, however, is targeted at
<a href="https://github.com/jekyll/jekyll/">Jekyll</a> rather than Pelican. To make sure Pelican could read the files, I
wrote <a href="https://tacosteemers.com/articles/exitwpToPelican.py">a small script</a> (Python 2) to remove two lines and remove the
time from the date, as Pelican didn't support the time notation that was written to the files. Of course, I only found
out afterwards that there is a file in the Pelican project that seems to do the same
(<a href="https://github.com/getpelican/pelican/blob/master/pelican/tools/pelican_import.py">pelican/tools/pelican_import.py</a>).
Because I haven't used it, I don't know how well it works, but such functionality also appears to be mentioned in
the manual so it should be supported. Once again it is shown that reading the manual can be a good idea ;).</p>
<p>I must say I'm very pleased with how easy the entire process has been. I'm not very up to date on Pelican yet, and I
haven't gotten around to adding comment functionality yet, but I am liking things so far. </p>
<p>There was only one problem with my workflow, accented characters:</p>
<p><code>WARNING: Could not process Notes/2014-01-03-switched-from-wordpress-to-pelican.markdown
'utf8' codec can't decode byte 0xf6 in position 967: invalid start byte</code></p>
<p>To my surprise, it choked on Thomas Frössman's name. I was typing the post on a Windows laptop and had saved that post
to disk on a Debian server using <code>nano</code>, after pasting the characters in to <code>nano</code> over a PuTTY SSH connection. PuTTY
didn't use the UTF8 encoding that was expected by Pelican. The fix was simple, set the
<code>Window->Translation->Character set translation->Remote character set</code> setting to UTF8.</p>
<p>Now I have everything the way I want it. I have a low-maintenance site and I can add posts with all my devices.</p>Something to think about when storing or processing files in your web app2013-05-18T00:00:00+02:002013-05-18T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2013-05-18:/articles/2013-05-18-something-to-think-about-when-storing-or-processing-files-in-your-web-app.html<p>Recently I saw code that looked like the following:</p>
<div class="highlight"><pre><span></span><code><span class="k">function</span> <span class="nf">handleFileRequest</span>
<span class="n">variable</span> <span class="s">file</span> <span class="s">=</span> <span class="s">getFile('storage/${params.filename}')</span>
<span class="n">send</span> <span class="s">file</span>
</code></pre></div>
<p>This code readily accepts a user-supplied piece of information to retrieve a file.
This is very wrong, luckily it was caught in a code review.</p>
<p>It is wrong because it allows …</p><p>Recently I saw code that looked like the following:</p>
<div class="highlight"><pre><span></span><code><span class="k">function</span> <span class="nf">handleFileRequest</span>
<span class="n">variable</span> <span class="s">file</span> <span class="s">=</span> <span class="s">getFile('storage/${params.filename}')</span>
<span class="n">send</span> <span class="s">file</span>
</code></pre></div>
<p>This code readily accepts a user-supplied piece of information to retrieve a file.
This is very wrong, luckily it was caught in a code review.</p>
<p>It is wrong because it allows for <a href="https://en.wikipedia.org/wiki/Directory_traversal">directory traversal</a>, and
<a href="http://projects.webappsec.org/w/page/13246952/Path%20Traversal">directory traversal is dangerous</a>.</p>
<p>It is natural to think something along the lines of 'Hmm, simply removing all instances of "..", and maybe "/", would
be a good start!'. But that wouldn't solve much, it would solve just this one instance where an attacker would try
something like "../private_file" as an input.</p>
<p>Depending on what kind of tools you are using or could potentially use, you can likely find solutions that are better
thought-through and less complex than re-inventing one for your particular case. Use the tools your
framework provides for sanitisation. If you don't use a framework, or it doesn't supply such a tool, find one.</p>
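<p>As an illustration of what such tools typically do (a sketch in shell for brevity; a real application would use its own language and framework): canonicalise the requested path first, and refuse anything that resolves to a location outside the storage directory.</p>

```shell
cd "$(mktemp -d)"
mkdir storage
echo 'public' > storage/report.txt
echo 'secret' > private_file
serve() {
    # Canonicalise first, then check that the result is still inside storage.
    storage=$(realpath storage)
    target=$(realpath -m "$storage/$1")
    case "$target" in
        "$storage"/*) cat "$target" ;;
        *) echo 'refused' ;;
    esac
}
serve 'report.txt'        # prints the file's contents
serve '../private_file'   # prints 'refused'
```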
<p>Better still, avoid passing around identifiers that are also file or directory names; it is likely that better
solutions can be found in your case too. This is comparable to how most developers have moved on from
building SQL queries by concatenating user-supplied input into the query text, to parametrised queries
and stored procedures.</p>
<p>It is always a good idea to use more than one security layer. Your web application likely runs on an operating system
that has built-in security features such as file access control, and your application/web server
(e.g. Tomcat or Apache) likely does too. So if possible, make sure to use proper access control! If you fetch files
using a different process than your main process, perhaps that script can be run with a user that only has access to
a particular directory. In most cases that won't be enough, because users should only be permitted to access their
'own' files. Information like that can be kept track of in a database. These two suggestions have different goals:
one is aimed at protecting the web application from the user, the other is aimed at protecting users from each other.</p>
<p>It could be a good idea to ask yourself, for your particular applications, who or what needs protecting from whom.
And could it be possible that technology you are already using has a ready-made solution?</p>Weak references, are you sure you want to use them?2013-05-09T00:00:00+02:002013-05-09T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2013-05-09:/articles/2013-05-09-weak-references-are-you-sure-you-want-to-use-them.html<p>One of the projects that I have been working on lately is a standard C# codebase, a framework of sorts, for a
particular niche category of software. This is for a client that develops and uses a lot of this kind of applications
in-house. Many of these have been written …</p><p>One of the projects that I have been working on lately is a standard C# codebase, a framework of sorts, for a
particular niche category of software. This is for a client that develops and uses a lot of this kind of applications
in-house. Many of these have been written as stand-alone efforts. Some are difficult to maintain and extend, as each
development effort followed its own path.</p>
<p>I have looked at the <a href="http://www.galasoft.ch/mvvm/">MVVM Light Toolkit</a> to see if it could be of use to us. My test
application would show odd behaviour. After a little while, an exception would show up when I clicked a button. The
root cause is a series of null-checks in the toolkit, that don't cover all cases. Also of interest to me was how my
code led to that bug showing up.</p>
<p>I found that the target of a reference that I had passed to an MVVM Light Messenger object had been garbage collected.
A Messenger object can be used for communication between viewmodels. Internally, the Messenger object held a
<a href="http://msdn.microsoft.com/en-us/library/system.weakreference.aspx">weak reference</a> to the object I had passed in.
In my test application, that object went out of scope almost immediately. Because there only existed a weak reference
to it, the garbage collector removed it. Subsequently, the Messenger object's weak reference no longer had a target,
i.e., it pointed to an object that did not exist anymore.</p>
<!-- more -->
<p>I had thought that the Messenger's reference to my object would keep my object from being gc'ed, but because some of
the Messenger's internal references are a WeakReference, it does not stop those objects from becoming eligible for
garbage collection.</p>
<p>As it turns out, this has been listed as a bug on the <a href="https://mvvmlight.codeplex.com/workitem/7579">MVVM Light Toolkit's CodePlex repository</a>.</p>
<p>This is a simple example of how one might encounter this situation (copied from a post on that same page):</p>
<div class="highlight"><pre><span></span><code>public MainWindow()
{
    InitializeComponent();
    LogManager _log = new LogManager(typeof(MainWindow).Name);
    Messenger.Default.Register<NotificationMessage>(this,
        x =>
        {
            // call to this fails on .Send unless you remove the local _log reference
            // or change the local variable to being a field of the
            // MainWindow class instead; that was a simple workaround for me
            _log.Error("An unexpected error occured. " + x.Notification, x.Content);
        });
}
</code></pre></div>
<p>If I recall correctly, my case was different. I passed the object as the first variable, the recipient.</p>
<p>I'm not certain that this should be classified as a bug in the MVVM Light Toolkit, as this is probably by design.
It makes sense to use weak references for event-related things because one might forget to unsubscribe ones objects,
a situation which I would consider a memory leak. Then again, why would you want that for the Messenger object,
typically used by viewmodels which exist during the entire lifetime of an application?</p>
<p>I'm not making use of the MVVM light framework anymore, but because initially it looked as though I would, I had to
do something with this situation. I can't give my coworkers a codebase that makes it that easy to create bugs that
are easy to miss. A bug like this is easy to miss during testing because it occurs intermittently. Many developers
have a fuzzy understanding of how memory management works (in general, or in whatever language or runtime they end up
using today or tomorrow), which could make this kind of situation difficult to fix.</p>
<p>I set out to find a way to be able to use the MVVM Light Toolkit code to keep the relevant objects alive longer, and
settled on using a <a href="http://msdn.microsoft.com/en-us/library/dd287757.aspx">ConditionalWeakTable</a>. The way the adjusted
Messenger makes use of it ensures that the targets of the weak references do not get gc'ed, but the entry itself will be
removed and gc'ed when it is no longer relevant to the Messenger. This means that the rest of the code does not need
changing. Unfortunately, this type of collection is only available starting from C# 4.0.</p>
<p>I haven't proposed a patch to the Toolkit's codebase. In this instance, my own preference simply appears to be different
than those of the Toolkit's author. The Toolkit isn't meant to help develop intense event-based systems, it is intended
for user-facing systems with a GUI.
If you are not writing real-time systems, are you sure you need to use a weak reference?</p>Why I like the filesystem as an interface to the OS2013-04-21T00:00:00+02:002013-04-21T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2013-04-21:/articles/2013-04-21-why-i-like-the-filesystem-as-an-interface-to-the-os.html<p>A short while ago I made a post about
<a href="https://tacosteemers.com/pages/usbud.html">a daemon that I have developed</a>.
This daemon needs information about connected storage devices and their mounted partitions.
There is currently only a Linux version. I tried to develop a Windows version too (in the same code base even)
but that …</p><p>A short while ago I made a post about
<a href="https://tacosteemers.com/pages/usbud.html">a daemon that I have developed</a>.
This daemon needs information about connected storage devices and their mounted partitions.
There is currently only a Linux version. I tried to develop a Windows version too (in the same code base even)
but that effort stranded at some point. Developing the daemon for Linux was easy. It simply required reading the
correct files from <code>/proc</code>, <code>/sys</code> and <code>/dev</code>.</p>
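<p>As a small illustration of that simplicity (a sketch; the exact fields are documented in <code>proc(5)</code>): listing the mounted block devices on Linux is just a matter of reading a file.</p>

```shell
# Each line of /proc/mounts holds: device mountpoint fstype options dump pass.
# Print the entries that are backed by a block device.
awk '$1 ~ "^/dev/" { print $1, $2, $3 }' /proc/mounts
```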
<p>Initially I tried to develop both versions at the same time. I looked at what daemons on Linux and services on
Windows generally look like. I took a look at getting a small piece of code that could fork on both platforms.
Then I looked at accessing information about hardware. Using C on Windows, things look to be massively less simple,
and less easy as well.
<a href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa394582%28v=vs.85%29.aspx">Windows Management Instrumentation</a>
seems most promising, but (parts of) it appear(s) to
<a href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa392726%28v=vs.85%29.aspx">need to be installed on some clients</a>,
and the amount of code required to retrieve the needed information seems massive. Using C#'s libraries, the daemon's
functionality looks easy to implement on Windows. Maybe a bit easier and less complex than it was using C on Linux,
even. But C# also requires the installation of a runtime, something I would really like to avoid.</p>
<p>Everyone who can program can write a program that reads files from the file system, and they can educate themselves
about which files they need simply by reading them. But as soon as you have to use complex series of commands that are
not well documented and may not (easily) be available to your programming language, development- or client environment,
educating oneself becomes less easy and developing your program becomes more complex.</p>USB Storage Back Up Daemon2013-04-13T00:00:00+02:002013-04-13T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2013-04-13:/articles/2013-04-13-usb-storage-back-up-daemon.html<p>I am working on a daemon (service) that will
<a href="https://github.com/TacoSteemers/usbud">automatically back up all mounted partitions on USB storage devices</a>,
with or without prior configuration for a particular drive. In its simplest form, the daemon will back up any USB
storage device that you plug in. This includes thumb drives and …</p><p>I am working on a daemon (service) that will
<a href="https://github.com/TacoSteemers/usbud">automatically back up all mounted partitions on USB storage devices</a>,
with or without prior configuration for a particular drive. In its simplest form, the daemon will back up any USB
storage device that you plug in. This includes thumb drives and all sorts of camera storage, such as SD and SDHC cards.</p>
<p>All desired functionality is in place, including white- and blacklist functionality. Additional functionality may be
added in the future, and some ideas for that can be found in
<a href="https://github.com/TacoSteemers/usbud/blob/master/README.md">the readme file</a>.</p>
<p>The software is known to work on the Ubuntu GNU/Linux distribution. It doesn't work on any Windows OS, but I'm working
on a project specifically for versions Vista and 7.</p>A small script to help organize your torrent downloads2013-03-19T00:00:00+01:002013-03-19T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2013-03-19:/articles/2013-03-19-liststaletorrentdata-a-small-script-to-help-organize-your-torrent-downloads.html<p>A couple of days ago I wrote a small housekeeping script (in Python), that
<a href="https://github.com/TacoSteemers/listStaleTorrentData">lists stale torrent data</a>. I use this to help clean up the
directory that I let my torrent application store the torrent data to. The script will list those files that are not
part of the …</p><p>A couple of days ago I wrote a small housekeeping script (in Python), that
<a href="https://github.com/TacoSteemers/listStaleTorrentData">lists stale torrent data</a>. I use this to help clean up the
directory that I let my torrent application store the torrent data to. The script will list those files that are not
part of the torrents loaded by a torrent application, but do occur in the given download directory. I find this useful
to keep track of which files I am still up- or downloading, and which I am not. Those files I might decide to remove,
or move to a different directory. </p>
<!-- more -->
<p>Finding out which torrents are in use by the torrent application turned out to not be that difficult for Transmission
and Deluge, as they keep a directory with current torrents. They do this regardless of whether you used a magnet link or
an actual torrent file. To find out which files belong to torrent files, one will need to read the torrent files.
As is shown on <a href="http://www.bittorrent.org/beps/bep_0003.html">the bittorrent.org website</a>, torrent files use a specific
encoding, which is called bencoding. As it turns out,
<a href="http://effbot.org/zone/bencode.htm">Fredrik Lundh published a decoder</a> in August 2007, which was very useful to me.</p>
<p>My script will list each file, with their full paths. Example usage:
<code>python listStaleTorrentData.py /home/user/.config/deluge/state /home/user/downloads/
python listStaleTorrentData.py /home/user/.config/transmission/torrents/ /home/user/downloads/</code>
The script will work with any torrent application that uses a directory with torrent files to store its state. Note
that only the one torrent application should be using the download folder to store files, because only that
application's known torrents will be checked.</p>
<p>If you wish to pass this list to a command, such as the <code>rm</code> command on your UNIX(-like) operating system, you may have
to tweak the output a bit. If the resulting files don't contain spaces and only contain those characters
in <a href="http://en.wikipedia.org/wiki/C0_Controls_and_Basic_Latin">the basic Latin block</a> that are allowed by your
filesystem, you can probably pass the output to <code>rm</code> by doing something like <code>| xargs rm</code>, provided that your
platform has a utility such as <code>xargs</code>. If your files do contain spaces, you will have to tweak the output such that
quotes are added.
<a href="http://unix.stackexchange.com/questions/65584/how-to-execute-command-on-list-of-file-names-in-a-file">Stack exchange has you covered on that front</a>.</p>
<p>You may notice on that page that the accepted answer uses a null character as a filename separator. Not every script or
application accepts each character as a seperator, but there is a good reason to use that separator if you can. If you are using <code>rm</code> and <code>xargs</code> you can separate filenames by a null character. This will prevent one silly but dangerous vector of attack, which is filenames with a newline in them. <code>rm</code> and <code>xargs</code> have commandline arguments that you can use to indicate that their input is null separated data. The null character can never appear in filenames (at least not to my knowledge).</p>
<p>Regarding filenames with a newline in them, take a look at the following to see why they might be problematic
(use a fresh, empty directory!):
<code>$ touch 'a
b'
$ touch 'b'
$ ls
a?b  b
$ ls | xargs rm
rm: cannot remove `a': No such file or directory
rm: cannot remove `b': No such file or directory
$ ls
a?b</code>
The <code>rm</code> command tried to remove the files <code>a</code> and <code>b</code>, after encountering <code>a[newline]b</code>. Of course <code>a</code> never existed,
but <code>b</code> did, and it got removed. If the characters after the newline form a valid path, that file might be deleted if
the file permissions allow it.</p>Setting up network shares with NFS and Linux systems2013-02-06T00:00:00+01:002013-02-06T00:00:00+01:00Taco Steemerstag:tacosteemers.com,2013-02-06:/articles/2013-02-06-setting-up-network-shares-with-nfs-and-linux-systems.html<p>For quite a while, I have had a backup solution that I was not happy with. I have now remedied this. I am using a
small server, a HP Proliant Microserver N40L. I tend to use Debian-based GNU/Linux systems at home. While I do
eventually want my new solution …</p><p>For quite a while, I have had a backup solution that I was not happy with. I have now remedied this. I am using a
small server, a HP Proliant Microserver N40L. I tend to use Debian-based GNU/Linux systems at home. While I do
eventually want my new solution to support Windows systems, it is not a priority. To support both Windows and Linux
operating systems, I would use Samba. However, I have had mixed results with Samba, in home and small business settings.
I am now successfully using NFS (Network File System) to back up computers that run Linux. Unfortunately,
NFS support under Windows isn't great. Apparently it is only supported on a few editions: certain versions of
Windows Vista, Windows Server 2008, Windows 7 Enterprise and Windows 7 Ultimate
(<a href="http://www.microsoft.com/en-us/download/details.aspx?id=2391">http://www.microsoft.com/en-us/download/details.aspx?id=2391</a>).
In this post I will outline how to install NFS under Debian-based operating systems, as a server and as a client.
As it turns out, this is pretty simple.</p>
<!-- more -->
<p><strong>The server</strong>
We will be using a volume that is mounted under <code>/media/backupspace</code>.</p>
<p>First, we will install the required software (I'm using apt and sudo to get the necessary packages and rights,
just substitute as necessary).
<code>sudo apt-get install nfs-kernel-server</code>
This install should make sure that everything required for NFS is loaded at startup.</p>
<p>Now we will mount this same volume in a directory that will be used by NFS.
First, we create the new mountpoint
<code>sudo mkdir -p /exports/backupspace</code>
and then we bind the volume, the one that we wish to place the backups on, to that new mountpoint. We can do this
manually with
<code>sudo mount --bind /media/backupspace /exports/backupspace</code>
To have it be mounted during startup, we can add it to the <code>/etc/fstab</code> file. This file can only be edited by a
superuser. We would add the following line:
<code>/media/backupspace /exports/backupspace none bind 0 0</code></p>
<p>Three other files are relevant to our configuration:
<code>/etc/default/nfs-kernel-server
/etc/default/nfs-common
/etc/exports</code>
These three files can also only be edited by a superuser.</p>
<p>For this example we will not be setting up authentication requirements. You may wish to do so yourself, perhaps
after you have gotten your share accessible without authentication.
In <code>/etc/default/nfs-kernel-server</code> we will be setting <code>NEED_SVCGSSD=no</code>. This indicates that we don't need authentication.</p>
<p>In <code>/etc/default/nfs-common</code> we set <code>NEED_GSSD=no</code> because we are not using authentication, and <code>NEED_IDMAPD=yes</code>
because we want to map between user IDs and names (to properly preserve permissions). For that to work, 'idmapd' should be
running and configured properly, which it probably is by default.</p>
<p>In <code>/etc/exports</code> we will indicate that we want to use our newly created /exports/backupspace. We add something like
<code>/exports/backupspace 10.0.0.0/24(rw,fsid=0,insecure,no_subtree_check,async)</code>
This should share /exports/backupspace, to the indicated subnet, with fairly standard settings and no authentication.</p>
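<p>The subnet in that line controls which clients may mount the share. If you want to double-check whether a particular client address falls inside it, Python's <code>ipaddress</code> module can do the arithmetic (the addresses below are just the example values from this post):</p>

```python
import ipaddress

def allowed(client_ip, export_subnet="10.0.0.0/24"):
    """True if client_ip lies inside the subnet listed in /etc/exports."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(export_subnet)
```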
<p>Restart the NFS service for the changes to have effect (<code>/etc/init.d/nfs-kernel-server restart</code> should do it).</p>
<p><strong>The client</strong>
First, we need to determine where we will mount our network share (i.e., what path we will be using to access our files).
This can't be done in an encrypted directory. I suggest using the location <code>/media</code> to create your mountpoint in, as
this is the directory that storage devices are mounted to by the operating system itself. For this example I will be
using <code>/media/backupspace</code> to mount my networked storage.
<code>sudo mkdir /media/backupspace</code></p>
<p>We install the client software:
<code>apt-get install nfs-common</code></p>
<p>We can manually mount our network share using the mount command:
<code>sudo mount -t nfs4 -o proto=tcp,port=2049 backupserver:backupspace /media/backupspace</code>
"-t nts4" indicates that we want to access something that is shared over nfs4.
"-o proto=tcp,port=2049" are the default connection settings for the NFS service.
"backupserver:backupspace" indicates which server and which share we will be using. Note that we do not list the
server-side path here.
The last part, "/media/backupspace", is the mountpoint on our client.</p>
<p>If we want the network share to be mounted on boot, we can use fstab, as shown earlier.
We will use the flags "hard,intr". To quote the NFS how-to on the "hard" setting:</p>
<blockquote>
<p>"The program accessing a file on a NFS mounted file system will hang when the server crashes. The process cannot be interrupted or killed (except by a "sure kill") unless you also specify intr. When the NFS server is back online the program will continue undisturbed from where it was. We recommend using hard,intr on all NFS mounted file systems."</p>
</blockquote>
<p>(<a href="http://nfs.sourceforge.net/nfs-howto/ar01s04.html">http://nfs.sourceforge.net/nfs-howto/ar01s04.html</a>)
Since this is a backup system, it is crucial that we do not use the "soft" setting, which will lead to data corruption
in such situations.</p>
<p>We will add something like the following to fstab:
<code>backupserver:backupspace /media/backupspace nfs rw,hard,intr 0 0</code>
On my backup server I also have a share that is rarely written to. That share is itself not a backup, though a backup
is created of it. Currently I mount it manually, but if I were to add it to fstab, I would have it mounted as read-only
("ro,hard,intr").</p>
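<p>Since a missing mount would make a backup silently write into the empty local mountpoint directory, it can be worth checking the mount before the backup runs. A minimal sketch (the path is just this post's example):</p>

```python
import os
import sys

def ensure_mounted(mountpoint="/media/backupspace"):
    """Abort if the share is not mounted, so the backup never
    fills the empty local mountpoint directory instead."""
    if not os.path.ismount(mountpoint):
        sys.exit("%s is not mounted; refusing to back up" % mountpoint)
```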
<p><strong>Some notes</strong>
It is worth spending some time thinking about NFS security (<a href="http://nfs.sourceforge.net/nfs-howto/ar01s04.html">http://nfs.sourceforge.net/nfs-howto/ar01s04.html</a>).
It is probably a wise idea to set up a firewall. You may wish to use a 'defense in depth' approach, and set up a
firewall both for your network but also specifically on your NFS server machine.
Currently I am using the <code>rsync</code> command to duplicate my files. This works fine. One downside is that there is no
solid, pre-built way to handle renamed files. This isn't really surprising: without storing metadata there is no
way to determine that a path has been renamed. Instead, rsync concludes that some items have been removed and others have
been added, and applies the same changes to the target location.
If you are copying to an NTFS formatted target then it is important to know that NTFS doesn't support Unix-style file permissions.</p>Minimal Linux and Windows process spawn test2012-10-23T00:00:00+02:002012-10-23T00:00:00+02:00Taco Steemerstag:tacosteemers.com,2012-10-23:/articles/2012-10-23-minimal-linux-and-windows-process-spawn-test.html<p>I am working on a small backup utility meant to add value to other, proper backup utilities. At the least, I want this
program to run on any vanilla install of a (semi-) recent Linux or Windows desktop version. I would also like to keep
a small file-size, and prefer …</p><p>I am working on a small backup utility meant to add value to other, proper backup utilities. At the least, I want this
program to run on any vanilla install of a (semi-) recent Linux or Windows desktop version. I would also like to keep
a small file-size, and prefer not to depend on any non-standard libraries. The language of choice is C. The planned
functionality of this program requires spawning different processes, without destroying the original process.</p>
<!-- more -->
<p>For that reason, I have looked into the simplest cross-platform way of doing so. Before I started to look for
information on launching new processes in C, running on a Linux and/or Windows OS, I had expected to find several
examples of possible approaches. This turned out not to be the case, and so I had the pleasure of finding my own
solution. Naturally, it was only after finishing my code sample that I stumbled upon the Wikipedia page
for <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Spawn_%28computing%29">"Spawn (computing)"</a>, which contains a lot
of useful information.</p>
<p>For spawning a different process that runs beside the current process,
the <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">fork function</a> in combination with a function from
the <a href="http://pubs.opengroup.org/onlinepubs/009604499/functions/exec.html">exec family</a> appears to be the standard on
Linux operating systems. fork() duplicates the running process, and execv() then loads a different process image into the
duplicate process. My preference to not depend on any non-standard libraries excludes a solution such as
the <a href="http://www.cygwin.com/">Cygwin dll</a>, which I am told would support fork() under Windows. Instead, I wrote some
platform specific code that makes use of
the <a href="http://msdn.microsoft.com/en-us/library/20y988d2%28v=vs.71%29.aspx">_spawnv</a> function. If you are not familiar
with it yet, be sure to read that page before using it. The page contains important information on the environment
of the spawned process. The second and third arguments of execv() and _spawnv() respectively, are the arguments (argv)
for the new program. That is what the v in the function names refers to.</p>
<p>I have probably overlooked something that a practiced C programmer would not. If you find anything to improve,
feel free to get in contact or fork <a href="https://gist.github.com/3940934">the github gist</a>.</p>
<div class="highlight"><pre><span></span><code><span class="cp">#include</span> <span class="cpf"><stdio.h></span><span class="cp"></span>
<span class="cp">#include</span> <span class="cpf"><string.h></span><span class="cp"></span>
<span class="cp">#include</span> <span class="cpf"><stdlib.h></span><span class="cp"></span>
<span class="cp">#ifdef _WIN32</span>
<span class="cp">#include</span> <span class="cpf"><process.h> /* Required for _spawnv */</span><span class="cp"></span>
<span class="cp">#include</span> <span class="cpf"><windows.h></span><span class="cp"></span>
<span class="cm">/* We make getpid() work in a similar </span>
<span class="cm"> way on Windows as it does on Linux */</span>
<span class="cp">#define getpid() GetCurrentProcessId()</span>
<span class="cp">#endif</span>
<span class="cp">#ifdef __linux__</span>
<span class="cp">#include</span> <span class="cpf"><unistd.h></span><span class="cp"></span>
<span class="cp">#endif</span>
<span class="kt">void</span> <span class="nf">spawn_new_process</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span> <span class="k">const</span> <span class="o">*</span><span class="n">argv</span><span class="p">);</span>
<span class="kt">int</span> <span class="n">pid</span><span class="p">;</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">argv</span><span class="p">[])</span>
<span class="p">{</span>
<span class="n">pid</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">();</span>
<span class="k">if</span><span class="p">(</span><span class="n">argc</span> <span class="o">></span> <span class="mi">1</span> <span class="o">&&</span> <span class="n">strcmp</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span><span class="s">"the_new_process"</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"[%d] This is a new process, and not a fork.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">pid</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span>
<span class="p">{</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"[%d] This is the original process.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">pid</span><span class="p">);</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">new_args</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
<span class="n">new_args</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="n">new_args</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="s">"the_new_process"</span><span class="p">;</span>
<span class="n">spawn_new_process</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span> <span class="k">const</span> <span class="o">*</span><span class="p">)</span><span class="n">new_args</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">spawn_new_process</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span> <span class="k">const</span> <span class="o">*</span><span class="n">argv</span><span class="p">)</span>
<span class="p">{</span>
<span class="cp">#ifdef _WIN32</span>
<span class="cm">/* This code block will also be reached on a </span>
<span class="cm"> 64 bit version of a Windows desktop OS */</span>
<span class="n">_spawnv</span><span class="p">(</span><span class="n">_P_NOWAIT</span><span class="p">,</span> <span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="k">const</span> <span class="o">*</span><span class="p">)</span><span class="n">argv</span><span class="p">);</span>
<span class="cp">#endif</span>
<span class="cp">#ifdef __linux__</span>
<span class="n">pid</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">();</span>
<span class="cm">/* Create copy of current process */</span>
<span class="n">pid</span> <span class="o">=</span> <span class="n">fork</span><span class="p">();</span>
<span class="cm">/* fork() returns 0 in the child and the child's pid in the parent */</span>
<span class="k">if</span><span class="p">(</span><span class="n">pid</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="cm">/* fork() returned the child's pid: we are in the </span>
<span class="cm">   original process; replace its image via exec */</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"[%d] Child (fork) process will call exec.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span>
<span class="n">pid</span><span class="p">);</span>
<span class="n">execv</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">argv</span><span class="p">);</span>
<span class="cm">/* Only reached if execv() fails */</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"[%d] Child (fork) process is exiting.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">pid</span><span class="p">);</span>
<span class="n">exit</span><span class="p">(</span><span class="n">EXIT_SUCCESS</span><span class="p">);</span>
<span class="p">}</span>
<span class="cp">#endif</span>
<span class="cm">/* We are still in the original process */</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"[%d] Original process is exiting.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">pid</span><span class="p">);</span>
<span class="n">exit</span><span class="p">(</span><span class="n">EXIT_SUCCESS</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div>
<p>When run on Linux, we get the following output:</p>
<div class="highlight"><pre><span></span><code><span class="err">prompt> minimal_fork_test</span>
<span class="err">[22166] This is the original process.</span>
<span class="err">[22167] Child (fork) process will call exec.</span>
<span class="err">[0] Original process is exiting.</span>
<span class="err">[22166] This is a new process, and not a fork.</span>
<span class="err">prompt></span>
</code></pre></div>
<p>When run under Windows, we get the following output:</p>
<div class="highlight"><pre><span></span><code><span class="n">prompt</span><span class="o">></span><span class="n">minimal_fork_test</span>
<span class="p">[</span><span class="mi">14800</span><span class="p">]</span> <span class="n">This</span> <span class="k">is</span> <span class="n">the</span> <span class="n">original</span> <span class="n">process</span><span class="p">.</span>
<span class="p">[</span><span class="mi">14800</span><span class="p">]</span> <span class="n">Original</span> <span class="n">process</span> <span class="k">is</span> <span class="n">exiting</span><span class="p">.</span>
<span class="n">prompt</span><span class="o">></span><span class="p">[</span><span class="mi">14808</span><span class="p">]</span> <span class="n">This</span> <span class="k">is</span> <span class="n">a</span> <span class="k">new</span> <span class="n">process</span><span class="p">,</span> <span class="k">and</span> <span class="k">not</span> <span class="n">a</span> <span class="n">fork</span><span class="p">.</span>
</code></pre></div>