Love to see projects that focus on the "boring" parts of getting infrastructure up and running. Something like this can save a bundle of time. This is a great project with a great team behind it!
Decent overview, but remember that projects change over time, so any evaluation is a snapshot. Even better, you can be the change you want to see! Thus far, the grpc project has been fairly responsive in making solid changes, whether in response to PRs or to filed issues. ;)
There is no real advantage to a dependency graph versus a list of layers. Originally, Docker images were actually fetched in sequence, similar to the ACI approach, but this blocked parallelization on each subsequent step. What has become OCI uses a list of layers to avoid this bottleneck.
As long as images share layers, network traffic would be the same in either scenario.
Reduced transfers will be achieved by formats that have a more granular filesystem representation. Right now, we are really held back by the continued use of tar, but that is another problem.
Yes, I agree that the method that OCI v1.0 & Docker v2.2 have of telling you ahead of time "you must fetch all of these objects and they will be this size" in a compact, single, metadata-only format is the best option. Then it is up to the fetching code to choose the best way to pipeline or parallelize.
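To make the "fetch plan up front" idea concrete, here is a minimal Go sketch, not any real client's implementation. It assumes a manifest carrying the `config` and `layers` descriptors (`mediaType`, `digest`, `size`) defined by the OCI image spec. The digests in `sampleManifest` are placeholders, and `parseManifest`, `totalSize`, `fetchAll` and the `fetch` callback are hypothetical names; a real client would issue registry requests inside `fetch`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Descriptor mirrors the fields a manifest lists for each blob.
type Descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

// Manifest holds only the parts of an image manifest needed to plan a fetch.
type Manifest struct {
	Config Descriptor   `json:"config"`
	Layers []Descriptor `json:"layers"`
}

// sampleManifest is illustrative only; the digests are placeholders.
const sampleManifest = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:placeholder-config",
    "size": 1470
  },
  "layers": [
    {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", "digest": "sha256:placeholder-layer-1", "size": 32654},
    {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", "digest": "sha256:placeholder-layer-2", "size": 16724},
    {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", "digest": "sha256:placeholder-layer-3", "size": 73109}
  ]
}`

// parseManifest decodes the small metadata document that describes the image.
func parseManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, fmt.Errorf("parse manifest: %w", err)
	}
	return m, nil
}

// totalSize reports how many bytes the manifest promises, before any blob is fetched.
func totalSize(m Manifest) int64 {
	total := m.Config.Size
	for _, l := range m.Layers {
		total += l.Size
	}
	return total
}

// fetchAll starts one goroutine per layer and waits for all of them.
// The result slice holds each layer's error (or nil) at the layer's index.
func fetchAll(m Manifest, fetch func(Descriptor) error) []error {
	errs := make([]error, len(m.Layers))
	var wg sync.WaitGroup
	for i, l := range m.Layers {
		wg.Add(1)
		go func(i int, l Descriptor) {
			defer wg.Done()
			errs[i] = fetch(l)
		}(i, l)
	}
	wg.Wait()
	return errs
}

func main() {
	m, err := parseManifest([]byte(sampleManifest))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d layers, %d bytes to fetch\n", len(m.Layers), totalSize(m))

	// Stand-in for a registry request; it only logs what it would download.
	errs := fetchAll(m, func(d Descriptor) error {
		fmt.Printf("fetching %s (%d bytes)\n", d.Digest, d.Size)
		return nil
	})
	for i, e := range errs {
		if e != nil {
			fmt.Printf("layer %d failed: %v\n", i, e)
		}
	}
}
```

Because every digest and size is known before the first blob request, the fetcher is free to choose the strategy: one goroutine per layer as here, a bounded worker pool, or a strictly sequential pipeline.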
I think that lack of parallelism has almost nothing to do with graphs vs lists. It would be pretty easy to either store the entire dependency graph's fetching-critical metadata in each image or to make it possible to fetch the metadata separately from the core data. Both of those would allow you to parallelize downloading of image data in the graph-based system.
Having more granular layers can help a bit but in practice it hits its limits very quickly because higher layers will necessarily get swept up and have to be rebuilt when layers below them that they don't actually depend on get rebuilt. This rules out having a core image and a set of modules that each downstream image may or may not need (to be mixed in or not per-image). Thus, when you use the layer-based system in practice you usually end up with very little layer sharing between images, outside of the core Linux distro layers.
Note that I did not say "more granular layers" but "more granular filesystem representation". This is a key point in achieving better data sharing. In fact, these use cases are not ruled out, but bolstered by OCI's approach. The introduction of alternative systems will be much easier to do when OCI is broadly implemented.
As far as metadata is concerned, your suggestion is exactly what OCI does. The problem with ACI is that it embeds a large part of the metadata into a tar file, which has to be fetched in its entirety. OCI is mostly metadata scaffolding, made up of indexes, manifests and configs that can all be fetched without large bandwidth requirements.
The compositional aspect that you've brought up has been explored and it doesn't make a whole lot of sense to cram that into images. Typically, such a system requires composing container filesystems through named references, allowing components to be independently rebuilt. Because this composition often relies on details of the target deployment media (orchestration system, container runtime, specific operating system, etc.), especially in how it deals with the security of name resolution, baking it into an image format leads to massive inflexibility.
Moving it up the stack works much better and avoids the technical complexities that come with doing it within images. We can see this image composition in action in k8s pods, docker stacks and other systems. Such constructions can be distributed through OCI images, through media types, but they are based on the compositional capabilities of the target system.
AppC has layers[0]. In fact, an ACI is just a layer.
While layers are codified into the OCI specification, the manifest system is flexible and allows for future extension. The main goal of OCI is to provide a specification for what is already widely deployed without introducing too many new concepts.
Nix has been a very cool project to watch over the years.
You can address part of the problem of picking up extra data in final images by declaring temporary build locations, such as `/var/lib/cache`, as a volume. Anything written to a volume won't be included in the final image.
Not disagreeing outright, but there are studies[1] that link high-poverty areas with higher lead concentrations. There was also an interactive map, some time ago, that showed lead soil levels by zip code and they definitely seemed to match up with more problematic neighborhoods. Clearly, the factors you call out have an influence, but lead seems to be a part of that equation.
(you mean high poverty areas correlated with lead concentration, right?)
My first guess would not be that lead causes poverty (is that what you are suggesting?), but that lead-producing activities (highways, factories, waste dumps) tend to be undesirable, and tend to be placed in poorer neighborhoods because poor people have less political power.
Yes, I do mean areas with high poverty or low income.
I am not really suggesting any particular cause of poverty, but lead-soil levels seem to have some negative effect. The more important suggestion is that it's not just income disparity, anonymity and juxtaposition of poverty and wealth, if at all, that leads to higher crime rates in cities.
In fact, mixed-income neighborhoods have been linked to better social mobility, but that is getting off-topic.
I guess it's topical if it's about proper conclusions from statistics.
I think your conclusion, going from a correlation between lead concentration in soils and poverty to "lead-soil levels seem to have some negative effect", is entirely unjustified.
The alternate hypothesis I mentioned seems as plausible, or more so: that it's not lead-soil levels that are having any effect at all, but rather that lead-producing activities are socially undesirable, and end up in poor neighborhoods because poor people lack political power.
I suppose the scientific method would be to devise an experiment or investigation that would attempt to distinguish between these two hypotheses. Possibly people already have, and are arguing about it in the academic literature now.
One danger of concluding causation from correlation in the 'big data' era is how easy it is to go hunting for correlations. If I test ketchup consumption against 1000 other variables, it is fairly likely that I'll find a correlation with at least one of them. (If I flip a coin 10 times, and then repeat this experiment 1000 times, it's not unlikely that in at least one of those 10-flip runs I'll get 10 heads.) So maybe I find that ketchup consumption is very closely correlated with public transit availability, across cultures and times and governments. That doesn't really mean that ketchup causes public transit or vice versa; it just means that if you have enough data, you're going to find happenstance correlations.
It seems you are trying hard to find something in my statement to refute. I was merely pointing out that the GP's conclusions are probably incomplete, given the data we have on lead.
Most of my previous comments were quite non-committal. For instance, the phrase "lead-soil levels seem to have some negative effect" was cherry-picked for your analysis, and given much stronger meaning than intended. First, let's examine the word choice of "seem". It means "give the impression or sensation of being something or having a particular quality." This modifies the statement to imply that the data "gives the impression" that there is "some negative effect". This is a much weaker statement than a hypothesis on the effect of lead-soil levels on poverty-stricken neighborhoods. Perhaps that statement would have been clearer if written as, "lead-soil levels seem to have an effect on crime". The implication from the sentence that follows is that the value of "some" is "crime".
It very well could be that lack of political power caused lead-producing activities to be concentrated in low-income neighborhoods. However, the issue at hand is not that lead-soil levels cause poverty; it's that lead-soil levels increase the incidence of crime. This point may not have been clear, as conceded above.
Indeed, correlation does not imply causation, but we know much about the health effects of lead [1] and its impact on decision making. We know that the correlation of crime rates with rising and falling lead levels holds in a variety of situations, across many policies and governments. We have both a known pathway and a strong correlation. Dismissing this as happenstance is unwise.
Create a separate ServeMux for your application and let the utilities register on the DefaultServeMux via init (and you can even register your own!). Bind the DefaultServeMux to the debug port (6060) using http.ListenAndServe(debugPort, nil), where the nil handler selects the DefaultServeMux, and bind your app to its port (8080) with http.ListenAndServe(appPort, appMux).