Andy's Math/CS page

Saturday, October 18, 2014

Making academic contacts (some thoughts for new researchers)

Suppose you're an undergraduate hoping to go into academic research, or a beginning grad student.  It could be very helpful to have academic contacts at other schools---such as professors, but also maybe postdocs and grad students. (I'm concerned here with starting scientific discussions and/or collaborations, not making contacts for graduate admissions per se.  Admissions decisions will be made based on your application.  But a successful collaboration and resulting publication, or letter of support, is one of the best things you can hope to add to your applications.)

Say you've never written a paper, and may not yet have the time or readiness to solve major problems on your own; but you want to have real scientific conversations and perhaps even collaboration with a busy researcher you admire.  What could you possibly talk about?  What can you offer them?   Below is my advice.


***

1)  In my view, the best approach to making contacts is closely related to how I'd recommend spending your own study time as a beginning researcher:  (a) start reading research papers as soon as possible, and (b) try diligently to ask interesting follow-up questions to the papers you like.

Asking questions is a central research skill.  It is one you can start learning early on because, even in technically difficult fields, simple patterns of questioning often recur in many forms across papers.  E.g., in the theory of computing, if there is a new model of computation being studied, it is likely to have randomized or nondeterministic versions that can be studied next.  Or if there is an algorithmic problem, it might have variant problems to be solved in, say, the communication protocol or decision-tree models.  And so on.

Even though it is not so difficult to produce these variations, few people do so systematically.  If you keep at it, and keep trying to learn and adopt more patterns of questioning, you will hit upon interesting and solvable problems.  You will also start to develop a taste for which problems are likely to be most interesting, which are too hard, and so on.  (Side note---I believe that keeping a research journal with your questions greatly helps this process.)


2)  Now returning to the goal of making contacts, I think that one of the best kinds of email you can send is one that contains a good research question reasonably related to the person's research area.  If you can do this and pique their interest, they may well enter into a dialogue that could become a full-fledged collaboration---regardless of your credentials on paper.  After all, you've already brought something important to the table.

A key advantage of this approach to making contacts is that you're aiming to attract the researcher's own interest and curiosity, rather than just asking them for something.  Another advantage is that there is a lot of freedom in asking and considering research questions.  If you ask someone an interesting, specific question about topic X, there is no presumption that X is your only interest or that future interactions will be limited to that.  And in the course of your correspondence, you might realize there is a better question to ask about X, or you might get around to asking something about topic Y as well.  That's how scientific interactions go.

So you don't have to worry about choosing the exact right question to represent yourself, as long as it is good and leads to discussion.  In this respect it's less stressful than trying to introduce yourself by defining your whole research outlook.  And it's certainly more promising than suggesting a collaboration based on your GPA or work experience.

Of course, asking good questions isn't easy and takes work. You should think carefully about them before sending, try your best to answer them yourself, and see if there are initial observations or partial solutions you can provide with your question to show you're serious.   Maybe in the end it will be a question they can answer right away; but again, it could still lead to other questions.  There is little risk in trying and it is likely to at least be a useful exercise.

Wednesday, September 28, 2011

A geometric-graphs offering

Introduction

After a night of board games, I found myself thinking about the peculiarities of movement on discrete versions of the plane. This suggested a number of questions. As they would likely suffer neglect at my hands, I'm posting them here for others to enjoy---any ideas or references are very welcome.

The basic structure that got me thinking was the 8-neighborhood (or Moore) graph:


This graph (which we'll denote by $G_{8N}$) describes how a chess king moves on an infinite chessboard. It's often convenient for game designers, but there's something... wrong about it: it distorts distances in the plane.

To make this formal, let $G$ be a (finite or countably infinite) undirected graph. Let $d_G(u, v)$ denote the shortest-path distance in $G$ between vertices $u, v$.

We say $F: V(G) \rightarrow \mathbf{R}^2$ is an embedding of $G$ if $|| F(u) - F(v)||_2 \geq d_G(u, v)$ for all distinct vertices $u, v$. (This is just a normalization convention.) Define the distortion of $F$ as the maximum (supremum) of
\[ \frac{|| F(u) - F(v) ||_2}{d_G(u, v)} , \quad{} u \neq v . \]

The study of low-distortion embeddings (which can be pursued in a more general setting) has been a highly active TCS research topic, largely due to its role in designing efficient approximation algorithms for NP-hard problems. My initial focus here will be on embeddings for periodic and highly symmetric graphs like $G_{8N}$.

As an example, look at the usual embedding of the 8-neighborhood graph into the plane. This has a distortion of $\sqrt{2}$, witnessed by points along a diagonal.
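
That figure is easy to check numerically. The following Python sketch (the patch size and function names are my own choices) builds a finite patch of $G_{8N}$ under the usual unit-grid embedding, computes graph distances by breadth-first search, and reports the extreme ratios: the minimum confirms the embedding is non-contracting, and the maximum is the distortion.

```python
from collections import deque
from itertools import product
from math import dist

N = 6  # side length of the finite patch (an arbitrary choice)
vertices = list(product(range(N), repeat=2))

def neighbors(v):
    # King moves: up to 8 neighbors inside the patch.
    x, y = v
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0) and 0 <= x + dx < N and 0 <= y + dy < N:
                yield (x + dx, y + dy)

def graph_distances(src):
    # Breadth-first search from src.  Within a square patch, shortest king
    # paths stay inside the bounding box, so no boundary effects arise.
    d = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in neighbors(v):
            if w not in d:
                d[w] = d[v] + 1
                queue.append(w)
    return d

ratios = []
for u in vertices:
    du = graph_distances(u)
    for v in vertices:
        if v != u:
            ratios.append(dist(u, v) / du[v])

print(min(ratios), max(ratios))  # min >= 1 (valid embedding); max = distortion
```

On the infinite graph, the king-move distance is just the Chebyshev distance $\max(|dx|, |dy|)$, which is why the maximum ratio $\sqrt{2}$ is attained along diagonals.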

Warm-up: Show that $\sqrt{2}$ is the minimum distortion of any embedding of the 8-neighborhood graph $G_{8N}$.

Symmetries and Distortion

Now a basic observation here is that, when we started with a graph with a high degree of inherent symmetry, we found that its optimal (distortion-minimizing) embedding was also highly symmetric. I would like to ask whether this is always the case.

For background, an automorphism of $G$ is a bijective mapping $\phi$ from $V(G)$ to itself, such that $(u, v) \in E(G) \Leftrightarrow (\phi(u), \phi(v)) \in E(G)$.
Let's say that a graph $G$ has 2D symmetry if there's an embedding $F$ of $V(G)$ into the plane, and linearly independent vectors $\mathbf{p}, \mathbf{q} \in \mathbf{R}^2$, such that a translation of the plane by $\mathbf{p}$ or by $\mathbf{q}$ induces an automorphism of $G$ (in the obvious way). In this case we also say the embedding $F$ has 2D symmetry.

So for example, with the usual embedding of $G_{8N}$, we can take $\mathbf{p} = (1, 0), \mathbf{q} = (0, 1)$.

Question 1: Suppose $G$ has 2D symmetry. Does this imply that there is a distortion-minimizing embedding of $G$ with 2D symmetry?

Say that $G$ is transitive if "all points look the same": there's an automorphism of $G$ mapping any vertex $u$ to any other desired vertex $v$. Similarly, say that an embedding $F$ of $G$ is transitive if translating the plane by any vector of the form $(F(u) - F(v))$ induces an automorphism of $G$. (The usual embedding of $G_{8N}$ is transitive.)

Question 2: Suppose $G$ has 2D symmetry and is transitive. Is there a distortion-minimizing, transitive embedding of $G$ with 2D symmetry?

Question 3: Suppose $G$ has 2D symmetry, and is presented to us (in the natural way, by a finite description of a repeating "cell"). What is the complexity of determining the minimum distortion of any embedding of $G$? What about the case where $G$ is also transitive?

It seems clear that the answers to Questions 1 and 2 are highly relevant to Question 3.

Graph Metrics and Movement

I want to shift focus to another type of question suggested by $G_{8N}$. Let's back up a bit, and think about the familiar 4-neighborhood graph $G_{4N}$. It's not hard to see that the minimum-distortion embedding of $G_{4N}$ also has distortion $\sqrt{2}$. (You have to blow up the grid by a $\sqrt{2}$ factor to make the diagonals long enough.) Yet somehow $G_{4N}$ seems considerably more natural as a discrete representation of movement in the plane. Why?

I think the answer is that, with the usual embedding of $G_{4N}$, the graph distances $d_G(u, v)$ correspond to actual Euclidean travel-distances, under the restricted form of paths in which we confine ourselves to the line segments between vertices. (You can see why this metric is sometimes called "taxicab geometry.") By contrast, the usual embedding of $G_{8N}$ doesn't have this interpretation.

However, consider the following system of paths connecting points in the plane:


If we restrict ourselves to these paths, and if we make those squiggles the right length, then shortest Euclidean travel-distances actually do correspond to distances in the graph $G_{8N}$! This is so, even if we're allowed to switch paths at the crossover points.
So $G_{8N}$ is not totally weird as a discrete model of movement in the plane; it just corresponds to an odder restriction of movement.

More generally, say that a graph $G$, with nonnegative edge-weights ("lengths"), is an obstructed-plane graph, if there is an embedding of $G$ into $\mathbf{R}^2$ along with a set of "obstructions" (just a point-set in $\mathbf{R}^2$), such that shortest paths in $G$ correspond to shortest obstruction-avoiding paths in $\mathbf{R}^2$.

Question 4: What is the complexity of deciding whether a given graph (finite, say) is an obstructed-plane graph?

It simplifies things a bit to realize that, in trying to find an obstructed-plane realization of a graph $G$, the obstructions may as well be all of the plane except the intended shortest paths between all pairs of points. Using this observation, we can at least show that our problem is in NP. Is it NP-complete?

Any planar graph, with arbitrary nonnegative edge-weights, is clearly an obstructed-plane graph. But we've seen that $G_{8N}$, a non-planar graph, is also an obstructed-plane graph. (Quick---prove that $G_{8N}$ is non-planar!) The essence of the problem is to find systems of paths in the plane which, though they may cross, do not introduce any undesired "short-cuts" between vertices.

Now suppose we draw $G$ in the plane, along with a collection of "intended" shortest paths between each pair of vertices. (That is, we will obstruct the rest of the plane, and hope that these paths are indeed shortest in what remains.) We require that the intended $u$-$v$ path have Euclidean length $d_G(u, v)$.

A simple observation is that in order to avoid short-cuts, all 4-tuples of distinct vertices $u, u', v, v'$ must obey the following property:

$\bullet$ If the "intended path" from $u$ to $v$ intersects the intended path from $u'$ to $v'$, then
\[ d_G(u, v) + d_G(u', v') \geq d_G(u, u') + d_G(v, v') \]
and
\[ d_G(u, v) + d_G(u', v') \geq d_G(u, v') + d_G(u', v) . \]
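
For finite examples, this condition can be checked mechanically. Below is a Python sketch (the function names are mine, and unit edge lengths are assumed for simplicity): given a graph's shortest-path metric and a list of pairs of vertex pairs whose intended paths cross in a drawing, it tests both inequalities at every crossing.

```python
def metric(n, edges):
    # All-pairs shortest-path distances via Floyd-Warshall, unit edge lengths.
    INF = float("inf")
    d = {(i, j): (0 if i == j else INF) for i in range(n) for j in range(n)}
    for u, v in edges:
        d[(u, v)] = d[(v, u)] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[(i, j)] = min(d[(i, j)], d[(i, k)] + d[(k, j)])
    return d

def crossing_condition(d, crossings):
    # crossings: pairs ((u, v), (u2, v2)) of vertex pairs whose intended
    # paths intersect in the drawing.  True iff both inequalities hold
    # at every crossing.
    for (u, v), (u2, v2) in crossings:
        s = d[(u, v)] + d[(u2, v2)]
        if s < d[(u, u2)] + d[(v, v2)] or s < d[(u, v2)] + d[(u2, v)]:
            return False
    return True

# Crossing the two diagonals of a 4-cycle is consistent with the condition:
four_cycle = metric(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(crossing_condition(four_cycle, [((0, 2), (1, 3))]))  # True

# But crossing the two end edges of the path 0-1-2-3 would create a short-cut:
path = metric(4, [(0, 1), (1, 2), (2, 3)])
print(crossing_condition(path, [((0, 1), (2, 3))]))  # False
```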

Question 5: Is the necessary condition above also sufficient?

In a narrow sense, the answer to Question 5 is No: it's possible to draw a graph in this way and still introduce undesired short-cuts. My real question is whether, from a graph drawing with the property above, we can adjust the lengths of segments, without changing the topological structure of the drawing, in order to get the desired obstructed-plane realization.

It may be foolish to hope for such a simple condition to be sufficient. Also, an affirmative answer to Question 5 wouldn't seem to imply any new complexity upper bound for our problem (except perhaps to speed up the NP verification a bit). I ask only because I find the question interesting, and wasn't able to cook up any counterexamples in my brief attempt.


Wednesday, June 15, 2011

Joint computational complexity, and the "buy-one-get-one-free conjecture"

Below is a simple-to-state open question, stemming from this paper of mine from CCC'09. First, I'll state the question; then I'll give some background, explaining how it's an instance of a more general and significant problem.


The question


Let's consider the standard two-party model of communication complexity. Given inputs x and y to Alice and Bob respectively, suppose there are 3 functions the two parties are interested in evaluating on these inputs---let's call them F(x, y), G(x, y), H(x, y).


Question: is there a collection of total functions F, G, H, and a positive value T, such that:


(i) any one of F, G, H requires at least T bits of communication to compute;


(ii) any two of F, G, H can be computed in (1.01 T) bits of communication, on a common input (x, y);


(iii) but, computing all three of F, G, H on a common input requires at least (1.99 T) bits of communication.


I believe such a collection exists. We can call this the 'buy-one-get-one-free conjecture': Think of T as the individual 'price' of the 'items' F, G, H; we want to arrange a special 'deal' where the second item is essentially free, but one has to pay full-price for the third item.


Now if you think about it, what we're looking for is pretty strange. The function F should be efficiently computable in at least two 'essentially different' ways---one of which also gives us the value of G, and one of which gives H---yet there should be no efficient scheme to compute F that gives us G and H simultaneously. (This property seems easier to contrive when the inputs x, y are assumed to have a special, correlated form; I rule this out by insisting that F, G, H be total functions.)


The question makes equal sense when posed for other models of computation. In my paper, I proved the corresponding conjecture in the decision tree model of computation, as a special case of a more general result--see below. Communication complexity could be a reasonable model to attack next.


Please note: While this conjecture may require lower-bounds expertise to resolve, I believe that anyone with a creative spark could make an important contribution, by coming up with a good set of candidate functions F, G, H. Please feel encouraged to share any ideas you might have.


Background on the question


Let cc(F) denote the (deterministic) communication complexity of computing F(x, y). Next, let cc(F, G) denote the communication complexity of computing F(x, y) and G(x, y)---on the same input-pair (x, y). We define cc(F, H), cc(G, H), and cc(F, G, H) similarly.


Together, we think of these various quantities as summarizing the 'joint complexity' of the collection F, G, H. Of course, this notion can be extended to collections of k > 3 functions; the joint complexity is summarized by giving the communication complexity of all 2^k subsets of the collection. Let JC denote the function that takes as input a k-bit vector, and returns the complexity of computing the corresponding subcollection. So, in our 3-function example, we have

JC(1, 1, 0) = cc(F, G) and JC(0, 0, 1) = cc(H).


The question we want to ask is: what kinds of behavior are possible with the joint complexity, if we allow the functions F, G, H, etc. to be chosen arbitrarily? In other words, what different types of 'efficiencies' can arise in a collection of computational tasks (in the communication model)?


A little thought reveals some obvious constraints:


1. the joint complexity function JC must always be nonnegative and integral-valued, with JC(0) = 0.


2. monotonicity: Enlarging the subset of the functions to be computed cannot decrease the complexity. For example, we always have cc(F, G) >= cc(F), which translates to JC(1, 1, 0) >= JC(1, 0, 0).


3. subadditivity: Taking the union of two subsets of functions to be computed cannot increase the complexity beyond the sum of the individual complexities of the subsets. For example, cc(F, G, H) <= cc(F, G) + cc(H), since we can always compute (F, G) in an optimal fashion first, then compute H optimally afterwards.

(Technically, this assumes that in our model both players always know when a communication protocol halts, so that they can combine two protocols sequentially without any additional overhead. No big deal, though.)
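
For any finite table of values, constraints 1-3 can be verified mechanically. Here is a Python sketch (the subset representation and function names are my own choices) that checks whether a candidate cost function is an ECF; as an example, it confirms that the 'buy-one-get-one-free' profile with T = 1 passes all three constraints.

```python
from itertools import chain, combinations

def all_subsets(k):
    items = range(k)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(items, r) for r in range(k + 1))]

def is_ecf(C, k):
    # C maps frozensets of {0, ..., k-1} to costs.  Checks constraints 1-3.
    S = all_subsets(k)
    # 1. nonnegative, integer-valued, zero on the empty set
    if C[frozenset()] != 0 or any(C[s] < 0 or C[s] != int(C[s]) for s in S):
        return False
    # 2. monotonicity: A a subset of B implies C[A] <= C[B]
    if any(A <= B and C[A] > C[B] for A in S for B in S):
        return False
    # 3. subadditivity: C[A union B] <= C[A] + C[B]
    if any(C[A | B] > C[A] + C[B] for A in S for B in S):
        return False
    return True

# The 'buy-one-get-one-free' profile with T = 1: singletons cost 1,
# pairs cost 1, the full triple costs 2.
C = {frozenset(s): (2 if len(s) == 3 else min(len(s), 1))
     for s in all_subsets(3)}
print(is_ecf(C, 3))  # True
```

That this profile is a valid ECF, yet (as claimed below) not realizable as an actual joint complexity function, is exactly what makes it a counterexample.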


Now, a little further thought reveals that… well, there really aren't any other obvious, general constraints on the joint complexity! Let's call C an Economic Cost Function (ECF) if it obeys constraints 1-3. We are tempted to conjecture that perhaps every ECF is in fact equal to the joint complexity (in the communication model) of some particular collection of functions.


There are two things wrong with this conjecture. First, it's false, as can be seen by a simple counterexample: namely, the "buy-one-get-one-free" example, with T set to 1. That's how I stumbled onto this example, and is one reason why I find it interesting.

However, if we relax the problem, and just ask to realize some scalar multiple of C as a joint complexity function, this counterexample loses its force.


The second thing wrong with the conjecture (in its relaxed form) is that, even if true, it'd likely be impossible to prove. This is because determining the exact computational cost of even modestly complicated tasks is just way too hard. So I propose a doubly-relaxed form of the conjecture: I conjecture that if C is an ECF, then there is a joint complexity function that is a good pointwise approximation to some scalar multiple of C. (Here we allow a (1 ± eps) multiplicative error.)


In my paper, I managed to prove the corresponding conjecture for the model of decision trees (aka deterministic query algorithms). Several interesting ingredients were needed for the proof. Now, why do I believe the conjecture should also hold true for the communication model? In a nutshell, I think it should be possible to 'embed' tasks in the query model into the communication model, by a suitable distributed encoding of each bit, in such a way that the relative costs of all computational tasks are approximately preserved. If this could be shown, the result in the communication model would follow from my result for decision trees. (See the paper for more details.)


We may not be ready for an attack on the general conjecture, however. In particular, we seem to require a much better understanding of so-called 'Direct Sum problems' in communication complexity. Thus, I offer the 'buy-one-get-one-free conjecture' as a simpler, more concrete problem on which we can hope to make progress sooner.


In the decision tree model, my result allows us to realize an ECF of the 'buy-one-get-one-free' type as a joint complexity function; but I don't know of any method for this that's significantly simpler than my general construction. Even finding such a simpler method in the decision tree model would be a very nice contribution, and might lead to new ideas for the more general problem.


Monday, May 09, 2011

An exciting new textbook, and a request

Today I'd like to put out an appeal to readers. If you have a solid grasp of English and an interest in circuit complexity (no expertise required!), please consider helping proofread the forthcoming book "Boolean Function Complexity: Advances and Frontiers" by Stasys Jukna.

Stasys (whom I recently had the pleasure to meet at an enjoyable Dagstuhl seminar in Germany) is a talented researcher and a kind, gracious person; he's also worked tirelessly to produce high-quality textbooks for our community. I'm a long-time fan of his "Extremal Combinatorics: With Applications in Computer Science" which contains many gems of combinatorial reasoning in complexity theory.

His latest manuscript, to be published soon, promises to be a leading reference for circuit lower-bounds research. Although I've only read parts, it clearly achieves both depth and breadth; I really think anyone in the field could learn something new and useful here.

The main cause for concern is that Stasys (who hails from Lithuania) is not a native English speaker, and the text needs work in places to become grammatically correct and idiomatic. Also, it seems a full-time copy-editor is not available for this project.
So readers: by volunteering to proofread a chapter or two, you'll be doing a valuable service for present and future students of complexity theory. Language editing is the essential thing--you can skim over the equations, so it really doesn't take that long. (Of course, mathematical feedback is also welcome.)

The manuscript is available here; use the following login info:
User: friend
Password: catchthecat

You can email Stasys your comments. The text is long (500+ pages), so if you're doing or planning to do some editing, please enter your name, the chapters you'll edit, and a timeframe in the comments below. This will help us maximize coverage of the draft.

Thanks in advance for your help!


Friday, December 10, 2010

Harassment Policies for Theory Conferences

Following offline conversations and recent discussions on other blogs (hat-tip to Anna and David), I want to promote the Geek Feminism Blog initiative asking computing conferences to adopt explicit policies against sexual harassment. Bringing such policies to theory conferences that don't yet have them is an important step. (Note that this would mean putting them on conference websites and preparing conference staff. Just having some boilerplate document hidden somewhere on the IEEE or ACM websites is not enough.)

What is the value of such a policy? Geek Feminism provides a policy template whose intro spells it out well. Such a policy

"sets expectations for behavior at the conference. Simply having an anti-harassment policy can prevent harassment all by itself...

"...it encourages people to attend who have had bad experiences at other conferences...

"...it gives conference staff instructions on how to handle harassment quickly, with the minimum amount of disruption or bad press for your conference."

Stating such a policy would cost nothing, and local conference staff could prepare for their roles using anti-harassment training materials, which abound on the web -- I invite others to suggest good ones.

So why would we hesitate to adopt such policies? I will suggest three possible reasons, and explain why they're unconvincing.

First, there is a certain tendency to deride anti-harassment training as "sensitivity training" and as stating the obvious. But whether or not most of us know how to treat others respectfully, responding to disrespectful treatment is another story. Conference staff need to know there are circumstances under which they can and should reprimand attendees or even eject them, and they need to mentally rehearse for these difficult tasks. Attendees need to know the staff are ready to help.

Second, some might object that while harassment may be a major problem in other parts of the computing/tech world, it's less of a problem in our mature, enlightened theory community. Of course, this would be a self-serving belief without empirical support. I'm not aware of any systematic efforts to track harassment incidents at theory conferences, although Geek Feminism maintains a wiki record of incidents in computing/tech more broadly -- I hope theory conference-goers will find and use it or something similar. But if we can agree that sexual harassment is seriously wrong -- harmful to individuals and the community when it occurs -- surely we can take the time to state this publicly and prepare ourselves to deal with it, whatever its frequency.

Third, might an anti-harassment policy inhibit our freedom of expression too much or make people afraid to interact? Let me turn this question around. Almost all universities and major employers have explicit anti-harassment policies (here's MIT's, for example). Most of us support these precautions and don't feel oppressed by the policies. Why should conferences, which are outgrowths of the academic system, be different? Do we believe there is some special spirit of lawlessness that we need to protect at conferences, and only at conferences?

Of course not. So I support the harassment-policy initiative, and encourage others to do so as well.

Thursday, November 04, 2010

ECCC: what authors should know

Anyone interested in computational complexity should be aware of ECCC, the most important and widely-used online repository for complexity papers. (Depending on your specific interests, various sections of the arXiv, along with the Cryptology ePrint Archive, may be equally important to follow.)

Unfortunately, the technical side of ECCC's submission process is arguably broken, and seems to trip up many if not most authors. Here are the issues I'm aware of:

1: no preview function.

Unlike the arXiv, ECCC offers LaTeX support for on-site abstracts, so you can have as many funky symbols as you want. Good thing? No, because the site offers no way to preview what the compiled math will look like. This results in bizarre spacing effects and compile errors. (The most frequent problem, I think, is authors trying to use their own custom macros in the abstract, or commands from outside the LaTeX fragment supported by the site.)

Nor is it possible to preview what your document will look like (assuming it's accepted). This brings us to the second point:

2: hyperrefs are broken.

Many authors these days like to use internal hyperlinks in their document (provided by the hyperref package in LaTeX). This way, in case the reader forgets what Lemma 14.5 said, there's a link to it every time it's invoked. ECCC is happy to accept hyperref'd papers, and when they appear on the site, they'll have the same appearance you've chosen to give them. Unfortunately, in most cases the damn things won't do anything when you click on them.

Faulkner wanted to print The Sound and the Fury in multi-colored ink. I happen to like the look of colorful hyperrefs, even broken ones, but I still feel like a fool when they're all over a paper of mine.

3: keywords are nearly useless.

There is so little standardization in the use of keywords, and they're handled so rigidly, that clicking on them is often a waste of time. For example, the keywords 'algorithmic meta-theorems' and 'algorithmic meta theorems' bring up one paper each -- two lonely souls separated by a hyphen. (The search tool and the browse-able list of keywords are somewhat more useful, but still probably less so than googling.) Mistyped keywords are another danger. I've also seen author names that, when clicked, bring up a proper subset of that author's work for no apparent reason.

Why does this matter?

I think all this constitutes a serious problem. But one possible objection to my view is that authors can always post revisions to their papers -- fixing abstracts, documents, and keywords in one stroke.

This might be an OK solution, except for the empirical fact that almost nobody does this (myself included). I think, and others have also opined, that this is because people are afraid to revise. Presumably, they fear there's a widespread perception that posting revisions = mistakes or sloppiness.

Whether or not this perception is actually widespread, we should stand visibly against it, to loosen the grip of pointless anxieties and to improve the quality of available papers. A more reasonable attitude is that having early access to preprints is a good thing, but that such papers will almost always have imperfections. Revising a paper within a few weeks or months of its release ought to be a sign of conscientious authors, not sloppy ones. This holds doubly in the context of a messed-up submission system.
Of course, there is such a thing as too many revisions, and it is possible to post a paper too early in the editing process. Where that line should be drawn is a tough topic that deserves its own discussion.

What's the solution?

We can and should expect more from ECCC. Specifically, a preview function for abstracts and documents would be a key improvement. But at the least, there should be a clear list of warnings to authors about common submission errors.

The web administrators are aware of these issues, and have been for some time; but we as users can each do our part to communicate the importance and urgency of fixing these problems. This is especially true of users on the site's scientific board.

In the meantime, what should you do if your submission doesn't turn out the way you expected? Last time this happened to me, I contacted a web admin, a friendly guy who was able to fix part of the problem for me, without resorting to the dreaded revision step. This might work for you as well.

Wednesday, July 21, 2010

Injective polynomials

From a paper of MIT's Bjorn Poonen, I learned of an amazingly simple open problem. I'll just quote the paper (here Q denotes the rational numbers):

"Harvey Friedman asked whether there exists a polynomial f(x, y) in Q(x, y) such that the induced map Q x Q --> Q is injective. Heuristics suggest that most sufficiently complicated polynomials should do the trick. Don Zagier has speculated that a polynomial as simple as x^7 + 3y^7 might be an example. But it seems very difficult to prove that any polynomial works. Both Friedman's question and Zagier's speculation are at least a decade old... but it seems that there has been essentially no progress on the question so far."

Poonen shows that a certain other, more widely-studied hypothesis implies that such a polynomial exists. Of course, such a polynomial does not exist if we replace Q by R, the reals. In fact any injection R x R --> R must be (very) discontinuous.
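
Zagier's candidate can at least be probed empirically. The Python sketch below (a brute-force search over an arbitrarily chosen small box of rationals, certainly no substitute for a proof) uses exact rational arithmetic to look for collisions of f(x, y) = x^7 + 3y^7:

```python
from fractions import Fraction
from itertools import product

def collisions(num_bound, den_bound):
    # Search for distinct pairs (x, y) with equal value of x^7 + 3*y^7,
    # over rationals p/q with |p| <= num_bound and 1 <= q <= den_bound.
    rats = {Fraction(p, q)
            for p in range(-num_bound, num_bound + 1)
            for q in range(1, den_bound + 1)}
    seen = {}
    found = []
    for x, y in product(sorted(rats), repeat=2):
        val = x**7 + 3 * y**7
        if val in seen:
            found.append((seen[val], (x, y)))  # a genuine collision
        else:
            seen[val] = (x, y)
    return found

print(collisions(3, 2))  # no collisions in this small box
```

Finding no collisions among small rationals is weak evidence at best, but the exact Fraction arithmetic at least avoids the false collisions that floating-point evaluation of seventh powers would produce.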

Suppose an injective polynomial f could be identified, answering Friedman's question; it might then be interesting to look at `recovery procedures' to produce x, y given f(x, y). We can't hope for x, y to be determined as polynomials in f(x, y), but maybe an explicit, fast-converging power series or some similar recipe could be found.

Finally, all of this should be compared with the study of injective polynomials from the Euclidean plane to itself; this is the subject of the famous Jacobian Conjecture. See Dick Lipton's excellent post for more information.
